(54) Title: METHODS FOR RAPID IDENTIFICATION OF PATHOGENS IN HUMANS AND ANIMALS
(57) Abstract: The present invention provides methods of: identifying pathogens in biological samples from humans and animals,
resolving a plurality of etiologic agents present in samples obtained from humans and animals, determining detailed genetic
information about such pathogens or etiologic agents, and rapid detection and identification of bioagents from environmental, clinical or
other samples.

METHODS FOR RAPID IDENTIFICATION OF PATHOGENS
IN HUMANS AND ANIMALS 

FIELD OF THE INVENTION 

The present invention relates generally to clinical applications directed to the
identification of pathogens in biological samples from humans and animals. The present
invention is also directed to the resolution of a plurality of etiologic agents present in samples 
obtained from humans and animals. The invention is further directed to the determination of 
detailed genetic information about such pathogens or etiologic agents. 

The identification of the bioagent is important for determining a proper course of
treatment and/or eradication of the bioagent in such cases as biological warfare and natural
infections. Furthermore, the determination of the geographic origin of a selected bioagent will
facilitate the identification of potential criminal identity. The present invention also relates to
methods for rapid detection and identification of bioagents from environmental, clinical or other
samples. The methods provide for detection and characterization of a unique base composition
signature (BCS) from any bioagent, including bacteria and viruses. The unique BCS is used to
rapidly identify the bioagent.

BACKGROUND OF THE INVENTION 

In the United States, hospitals report well over 5 million cases of recognized infectious
disease-related illnesses annually. Significantly greater numbers remain undetected, both in the
inpatient and community setting, resulting in substantial morbidity and mortality. Critical
intervention for infectious disease relies on rapid, sensitive and specific detection of the
offending pathogen, and is central to the mission of microbiology laboratories at medical centers.
Unfortunately, despite the recognition that outcomes from infectious illnesses are directly
associated with time to pathogen recognition, as well as accurate identification of the class and
species of microbe, and ability to identify the presence of drug-resistant isolates, conventional
hospital laboratories often remain encumbered by traditional slow multi-step culture-based
assays. Other limitations of the conventional laboratory which have become increasingly
apparent include: extremely prolonged wait times for pathogens with long generation time (up to
several weeks); requirements for additional testing and wait times for speciation and
identification of antimicrobial resistance; diminished test sensitivity for patients who have
received antibiotics; and absolute inability to culture certain pathogens in disease states
associated with microbial infection.

For more than a decade, molecular testing has been heralded as the diagnostic tool for
the new millennium, whose ultimate potential could include forced obsolescence of traditional
hospital laboratories. However, despite the fact that significant advances in clinical application of
PCR techniques have occurred, the practicing physician still relies principally on standard
techniques. A brief discussion of several existing applications of PCR in the hospital-based
setting follows.

Generally speaking, molecular diagnostics have been championed for identifying
organisms that cannot be grown in vitro, or in instances where existing culture techniques are
insensitive and/or require prolonged incubation times. PCR-based diagnostics have been
successfully developed for a wide variety of microbes. Application to the clinical arena has met
with variable success, with only a few assays achieving acceptance and utility.

One of the earliest, and perhaps most widely recognized, applications of PCR for clinical
practice is in detection of Mycobacterium tuberculosis. Clinical characteristics favoring
development of a nonculture-based test for tuberculosis include week- to month-long delays
associated with standard testing, occurrence of drug-resistant isolates and public health
imperatives associated with recognition, isolation and treatment. Although frequently used as a
diagnostic adjunct, practical and routine clinical application of PCR remains problematic due
to significant inter-laboratory variation in sensitivity, and inadequate specificity for use in low
prevalence populations, requiring further development at the technical level. Recent advances in
the laboratory suggest that identification of drug-resistant isolates by amplification of mutations
associated with specific antibiotic resistance (e.g., rpoB gene in rifampin-resistant strains) may
be forthcoming for clinical use, although widespread application will require extensive clinical
validation.

One diagnostic assay which has gained widespread acceptance is for C. trachomatis.
Conventional detection systems are limiting due to inadequate sensitivity and specificity (direct
immunofluorescence or enzyme immunoassay) or the requirement for specialized culture
facilities, due to the fastidious characteristics of this microbe. Laboratory development, followed
by widespread clinical validation testing in a variety of acute and nonacute care settings, has
demonstrated excellent sensitivity (90-100%) and specificity (97%) of the PCR assay, leading to
its commercial development. Proven efficacy of the PCR assay from both genital and urine
sampling has resulted in its application to a variety of clinical settings, most recently including
routine screening of patients considered at risk.

While the full potential for PCR diagnostics to provide rapid and critical information to
physicians faced with difficult clinical decisions has yet to be realized, one recently developed
assay provides an example of the promise of this evolving technology. Distinguishing life-threatening
causes of fever from more benign causes in children is a fundamental clinical
dilemma faced by clinicians, particularly when infections of the central nervous system are being
considered. Bacterial causes of meningitis can be highly aggressive, but generally cannot be
differentiated on a clinical basis from aseptic meningitis, which is a relatively benign condition
that can be managed on an outpatient basis. Existing blood culture methods often take several
days to turn positive, and are often confounded by poor sensitivity or false-negative findings in
patients receiving empiric antimicrobials. Testing and application of a PCR assay for enteroviral
meningitis has been found to be highly sensitive. With reporting of results within 1 day,
preliminary clinical trials have shown significant reductions in hospital costs, due to decreased
duration of hospital stays and reduction in antibiotic therapy. Other viral PCR assays now
routinely available include those for herpes simplex virus, cytomegalovirus, hepatitis and HIV.
Each has a demonstrated cost-saving role in clinical practice, including detection of otherwise
difficult-to-diagnose infections and newly realized capacity to monitor progression of disease and
response to therapy, vital in the management of chronic infectious diseases.

The concept of a universal detection system has been forwarded for identification of
bacterial pathogens, and speaks most directly to the possible clinical implications of a broad-based
screening tool for clinical use. Exploiting the existence of highly conserved regions of
DNA common to all bacterial species in a PCR assay would empower physicians to rapidly
identify the presence of bacteremia, which would profoundly impact patient care. Previous
empiric decision making could be abandoned in favor of educated practice, allowing appropriate
and expeditious decision-making regarding need for antibiotic therapy and hospitalization.

Experimental work using the conserved features of the 16S rRNA common to almost all
bacterial species is an area of active investigation. Hospital test sites have focused on "high
yield" clinical settings where expeditious identification of the presence of systemic bacterial
infection has immediate high morbidity and mortality consequences. Notable clinical applications
have included evaluation of febrile infants at risk for sepsis, detection of bacteremia in febrile
neutropenic cancer patients, and examination of critically ill patients in the intensive care unit.
While several of these studies have reported promising results (with sensitivity and specificity
well over 90%), significant technical difficulties (described below) remain, and have prevented
general acceptance of this assay in clinics and hospitals (which remain dependent on standard
blood culture methodologies). Even the revolutionary advances of real-time PCR technique,
which offers a quantitative, more reproducible and technically simpler system, remain
encumbered by inherent technical limitations of the PCR assay.

The principal shortcomings of applying PCR assays to the clinical setting include:
inability to eliminate background DNA contamination; interference with the PCR amplification
by substrates present in the reaction; and limited capacity to provide rapid reliable speciation,
antibiotic resistance and subtype identification. Some laboratories have recently made progress
in identifying and removing inhibitors; however, background contamination remains problematic,
and methods directed towards eliminating exogenous sources of DNA report significant
diminution in assay sensitivity. Finally, while product identification and detailed characterization
have been achieved using sequencing techniques, these approaches are laborious and time-intensive,
thus detracting from their clinical applicability.

Rapid and definitive microbial identification is desirable for a variety of industrial,
medical, environmental, quality, and research reasons. Traditionally, the microbiology laboratory
has functioned to identify the etiologic agents of infectious diseases through direct examination
and culture of specimens. Since the mid-1980s, researchers have repeatedly demonstrated the
practical utility of molecular biology techniques, many of which form the basis of clinical
diagnostic assays. Some of these techniques include nucleic acid hybridization analysis,
restriction enzyme analysis, genetic sequence analysis, and separation and purification of nucleic
acids (See, e.g., J. Sambrook, E. F. Fritsch, and T. Maniatis, Molecular Cloning: A Laboratory
Manual, 2nd Ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989). These
procedures, in general, are time-consuming and tedious. Another option is the polymerase chain
reaction (PCR) or other amplification procedure that amplifies a specific target DNA sequence
based on the flanking primers used. Finally, detection and data analysis convert the hybridization
event into an analytical result.

Another not yet fully realized application of PCR for clinical medicine is the
identification of infectious causes of disease previously described as idiopathic (e.g., Bartonella
henselae in bacillary angiomatosis, and Tropheryma whippelii as the uncultured bacillus
associated with Whipple's disease). Further, recent epidemiological studies which suggest a
strong association between Chlamydia pneumoniae and coronary artery disease serve as an example
of the possible widespread, yet undiscovered links between pathogen and host which may
ultimately allow for new insights into pathogenesis and novel life-sustaining or life-saving
therapeutics.

For the practicing clinician, PCR technology offers a yet unrealized potential for
diagnostic omnipotence in the arena of infectious disease. A universal reliable infectious disease
detection system would certainly become a fundamental tool in the evolving diagnostic
armamentarium of the 21st century clinician. For front line emergency physicians, or physicians
working in disaster settings, a quick universal detection system would allow for molecular triage
and early aggressive targeted therapy. Preliminary clinical studies using species-specific probes
suggest that implementing rapid testing in acute care settings is feasible. Resources could thus be
appropriately applied, and patients with suspected infections could rapidly be risk stratified to
the different treatment settings, depending on the pathogen and virulence. Furthermore, links
with data management systems, locally, regionally and nationally, would allow for effective
epidemiological surveillance, with obvious benefits for antibiotic selection and control of disease
outbreaks.

For the hospitalists, the ability to speciate and subtype would allow for more precise
decision-making regarding antimicrobial agents. Patients who are colonized with highly
contagious pathogens could be appropriately isolated on entry into the medical setting without
delay. Targeted therapy will diminish development of antibiotic resistance. Furthermore,
identification of the genetic basis of antibiotic resistant strains would permit precise
pharmacologic intervention. Both physician and patient would benefit with less need for
repetitive testing and elimination of wait times for test results.

It is certain that the individual patient will benefit directly from this approach. Patients 
with unrecognized or difficult to diagnose infections would be identified and treated promptly. 
There will be reduced need for prolonged inpatient stays, with resultant decreases in iatrogenic 
events. 

Mass spectrometry provides detailed information about the molecules being analyzed,
including high mass accuracy. It is also a process that can be easily automated. Low-resolution
MS may be unreliable when used to detect some known agents, if their spectral lines are
sufficiently weak or sufficiently close to those from other living organisms in the sample. DNA
chips with specific probes can only determine the presence or absence of specifically anticipated
organisms. Because there are hundreds of thousands of species of benign bacteria, some very
similar in sequence to threat organisms, even arrays with 10,000 probes lack the breadth needed
to detect a particular organism.

Antibodies face more severe diversity limitations than arrays. If antibodies are designed
against highly conserved targets to increase diversity, the false alarm problem will dominate,
again because threat organisms are very similar to benign ones. Antibodies are only capable of
detecting known agents in relatively uncluttered environments.

Several groups have reported detection of PCR products using high resolution
electrospray ionization-Fourier transform-ion cyclotron resonance mass spectrometry (ESI-FT-ICR
MS). Accurate measurement of exact mass combined with knowledge of the number of at
least one nucleotide allowed calculation of the total base composition for PCR duplex products
of approximately 100 base pairs (Aaserud et al., J. Am. Soc. Mass Spec., 1996, 7, 1266-1269;
Muddiman et al., Anal. Chem., 1997, 69, 1543-1549; Wunschel et al., Anal. Chem., 1998, 70,
1203-1207; Muddiman et al., Rev. Anal. Chem., 1998, 17, 1-68). Electrospray ionization-Fourier
transform-ion cyclotron resonance (ESI-FT-ICR) MS may be used to determine the mass of
double-stranded, 500 base-pair PCR products via the average molecular mass (Hurst et al., Rapid
Commun. Mass Spec., 1996, 10, 377-382). Use of matrix-assisted laser desorption ionization-time
of flight (MALDI-TOF) mass spectrometry for characterization of PCR products has been
described (Muddiman et al., Rapid Commun. Mass Spec., 1999, 13, 1201-1204). However, the
degradation of DNAs over about 75 nucleotides observed with MALDI limited the utility of this
method.

U.S. Patent No. 5,849,492 reports a method for retrieval of phylogenetically informative
DNA sequences which comprises searching for a highly divergent segment of genomic DNA
surrounded by two highly conserved segments, designing the universal primers for PCR
amplification of the highly divergent region, amplifying the genomic DNA by PCR technique
using universal primers, and then sequencing the gene to determine the identity of the organism.

U.S. Patent No. 5,965,363 reports methods for screening nucleic acids for
polymorphisms by analyzing amplified target nucleic acids using mass spectrometric techniques,
and procedures for improving mass resolution and mass accuracy of these methods.

WO 99/14375 reports methods, PCR primers and kits for use in analyzing preselected
DNA tandem nucleotide repeat alleles by mass spectrometry.

WO 98/12355 reports methods of determining the mass of a target nucleic acid by mass
spectrometric analysis, by cleaving the target nucleic acid to reduce its length, making the target
single-stranded and using MS to determine the mass of the single-stranded shortened target. Also
reported are methods of preparing a double-stranded target nucleic acid for MS analysis
comprising amplification of the target nucleic acid, binding one of the strands to a solid support,
releasing the second strand and then releasing the first strand which is then analyzed by MS. Kits
for target nucleic acid preparation are also provided.

PCT WO97/33000 reports methods for detecting mutations in a target nucleic acid by
nonrandomly fragmenting the target into a set of single-stranded nonrandom length fragments
and determining their masses by MS.

U.S. Patent No. 5,605,798 reports a fast and highly accurate mass spectrometer-based 
process for detecting the presence of a particular nucleic acid in a biological sample for 
diagnostic purposes. 

WO 98/21066 reports processes for determining the sequence of a particular target
nucleic acid by mass spectrometry. Processes for detecting a target nucleic acid present in a
biological sample by PCR amplification and mass spectrometry detection are reported, as are
methods for detecting a target nucleic acid in a sample by amplifying the target with primers that
contain restriction sites and tags, extending and cleaving the amplified nucleic acid, and
detecting the presence of extended product, wherein the presence of a DNA fragment of a mass
different from wild-type is indicative of a mutation. Methods of sequencing a nucleic acid via
mass spectrometry methods are also reported.

WO 97/37041, WO 99/31278 and U.S. Patent No. 5,547,835 report methods of
sequencing nucleic acids using mass spectrometry. U.S. Patent Nos. 5,622,824, 5,872,003 and
5,691,141 report methods, systems and kits for exonuclease-mediated mass spectrometric
sequencing.

Thus, there is a need for a method for bioagent detection and identification which is
both specific and rapid, and in which no nucleic acid sequencing is required. The present
invention addresses this need.

SUMMARY OF THE INVENTION 

The present invention is directed towards methods of identifying a pathogen in a
biological sample by obtaining nucleic acid from a biological sample, selecting at least one pair
of intelligent primers with the capability of amplification of nucleic acid of the pathogen,
amplifying the nucleic acid with the primers to obtain at least one amplification product, and
determining the molecular mass of at least one amplification product, from which the pathogen is
identified. Further, this invention is directed to methods of epidemic surveillance. By identifying
a pathogen from samples acquired from a plurality of geographic locations, the spread of the
pathogen to a given geographic location can be determined.

The present invention is also directed to methods of diagnosis of a plurality of etiologic
agents of disease in an individual by obtaining a biological sample from an individual, isolating
nucleic acid from the biological sample, selecting a plurality of amplification primers with the
capability of amplification of nucleic acid of a plurality of etiologic agents of disease, amplifying
the nucleic acid with a plurality of primers to obtain a plurality of amplification products
corresponding to a plurality of etiologic agents, and determining the molecular masses of the
plurality of unique amplification products which identify the members of the plurality of
etiologic agents.

The present invention is also directed to methods of in silico screening of primer sets to
be used in identification of a plurality of bioagents by preparing a base composition probability
cloud plot from a plurality of base composition signatures of the plurality of bioagents generated
in silico, inspecting the base composition probability cloud plot for overlap of clouds from
different bioagents, and choosing primer sets based on minimal overlap of the clouds.
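
By way of non-limiting illustration only, the in silico screening step described above may be sketched as follows. The sketch assumes that each "cloud" is represented simply as a set of (A, G, C, T) base counts and that overlap is scored as the number of base compositions shared between clouds of different bioagents; the primer set names, bioagent names, base counts, and the functions `cloud_overlap` and `choose_primer_set` are hypothetical and are not taken from the specification.

```python
from itertools import combinations


def cloud_overlap(clouds):
    """Count base compositions shared between clouds of different bioagents.

    `clouds` maps a bioagent name to a set of (A, G, C, T) base-count tuples.
    This set-intersection score is an assumed stand-in for visual inspection
    of a base composition probability cloud plot.
    """
    return sum(len(a & b) for a, b in combinations(clouds.values(), 2))


def choose_primer_set(clouds_by_primer_set):
    """Return the name of the primer set whose clouds overlap the least."""
    return min(
        clouds_by_primer_set,
        key=lambda name: cloud_overlap(clouds_by_primer_set[name]),
    )


# Hypothetical in silico base compositions for two candidate primer sets.
clouds_by_primer_set = {
    "primer_set_1": {
        "agent_X": {(25, 27, 22, 20), (24, 28, 22, 20)},
        "agent_Y": {(24, 28, 22, 20), (23, 28, 23, 20)},
    },
    "primer_set_2": {
        "agent_X": {(30, 20, 25, 19)},
        "agent_Y": {(27, 23, 24, 20)},
    },
}

best = choose_primer_set(clouds_by_primer_set)
print(best)
```

In this hypothetical data, the clouds for the first primer set share one base composition while those for the second share none, so the second primer set would be preferred under the assumed scoring.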

The present invention is also directed to methods of predicting the identity of a bioagent
with a heretofore unknown base composition signature by preparing a base composition
probability cloud plot from a plurality of base composition signatures of the plurality of
bioagents which includes the heretofore unknown base composition, inspecting the base
composition probability cloud for overlap of the heretofore unknown base composition with the
cloud of a known bioagent such that overlap predicts that the identity of the bioagent with a
heretofore unknown base composition signature equals the identity of the known bioagent.
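
By way of non-limiting illustration only, the prediction step described above may be sketched as follows. The sketch assumes, purely for simplicity, that the cloud of a known bioagent consists of its own (A, G, C, T) base composition together with every composition reachable from it by a single base substitution; the specification does not prescribe this construction, and the bioagent names, base counts, and the functions `single_substitution_cloud` and `predict_identity` are hypothetical.

```python
from itertools import permutations


def single_substitution_cloud(composition):
    """Build an assumed cloud around a known (A, G, C, T) base composition.

    The cloud holds the composition itself plus every composition obtained
    by replacing one base with another (one count down, another count up).
    """
    cloud = {composition}
    for lost, gained in permutations(range(4), 2):
        if composition[lost] == 0:
            continue  # cannot substitute away a base that is not present
        neighbor = list(composition)
        neighbor[lost] -= 1
        neighbor[gained] += 1
        cloud.add(tuple(neighbor))
    return cloud


def predict_identity(unknown, clouds):
    """Return the known bioagents whose cloud contains the unknown composition."""
    return sorted(name for name, cloud in clouds.items() if unknown in cloud)


# Hypothetical known base compositions for a single primer pair.
known = {"agent_X": (25, 27, 22, 20), "agent_Y": (30, 20, 25, 19)}
clouds = {name: single_substitution_cloud(comp) for name, comp in known.items()}

# A previously unobserved composition differing from agent_X by one A-to-G change.
print(predict_identity((24, 28, 22, 20), clouds))
```

An unknown composition that falls inside no cloud yields an empty list, which under these assumptions would indicate that none of the known bioagents is predicted.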

The present invention is also directed to methods for determining a subspecies
characteristic for a given pathogen in a biological sample by identifying the pathogen in a
biological sample using broad range survey primers or division-wide primers, selecting at least
one pair of drill-down primers to amplify nucleic acid segments which provide a subspecies
characteristic about the pathogen, amplifying the nucleic acid segments to produce at least one
drill-down amplification product and determining the base composition signature of the drill-down
amplification product wherein the base composition signature provides a subspecies
characteristic about the pathogen.

The present invention is also directed to methods of pharmacogenetic analysis by
obtaining a sample of genomic DNA from an individual, selecting a segment of the genomic
DNA which provides pharmacogenetic information, using at least one pair of intelligent primers
to produce an amplification product which comprises the segment of genomic DNA and
determining the base composition signature of the amplification product, wherein the base
composition signature provides pharmacogenetic information about said individual.

BRIEF DESCRIPTION OF THE DRAWINGS 

Figures 1A-1H and Figure 2 are consensus diagrams that show examples of conserved
regions from 16S rRNA (Fig. 1A-1, 1A-2, 1A-3, 1A-4, and 1A-5), 23S rRNA (3'-half, Fig. 1B,
1C, and 1D; 5'-half, Fig. 1E-F), 23S rRNA Domain I (Fig. 1G), 23S rRNA Domain IV (Fig. 1H)
and 16S rRNA Domain III (Fig. 2) which are suitable for use in the present invention. Lines with
arrows are examples of regions to which intelligent primer pairs for PCR are designed. The label
for each primer pair represents the starting and ending base number of the amplified region on
the consensus diagram. Bases in capital letters are greater than 95% conserved; bases in lower
case letters are 90-95% conserved; filled circles are 80-90% conserved; and open circles are less
than 80% conserved. The nucleotide sequence of the 16S
rRNA consensus sequence is SEQ ID NO:3 and the nucleotide sequence of the 23S rRNA
consensus sequence is SEQ ID NO:4.

Figure 2 shows a typical primer amplified region from the 16S rRNA Domain III shown
in Figure 1A-1.

Figure 3 is a schematic diagram showing conserved regions in RNase P. Bases in capital
letters are greater than 90% conserved; bases in lower case letters are 80-90% conserved; filled
circles designate bases which are 70-80% conserved; and open circles designate bases that are
less than 70% conserved.

Figure 4 is a schematic diagram of base composition signature determination using
nucleotide analog "tags" to determine base composition signatures.

Figure 5 shows the deconvoluted mass spectra of a Bacillus anthracis region with and
without the mass tag phosphorothioate A (A*). The two spectra differ in that the measured
molecular weight of the mass tag-containing sequence is greater than that of the unmodified sequence.

Figure 6 shows base composition signature (BCS) spectra from PCR products from
Staphylococcus aureus (S. aureus 16S_1337F) and Bacillus anthracis (B. anthr. 16S_1337F),
amplified using the same primers. The two strands differ by only two (AT->CG) substitutions
and are clearly distinguished on the basis of their BCS.

Figure 7 shows that a single difference between two sequences (A14 in B. anthracis vs. 
A15 in B. cereus) can be easily detected using ESI-TOF mass spectrometry. 

Figure 8 is an ESI-TOF of Bacillus anthracis spore coat protein sspE 56mer plus
calibrant. The signals unambiguously identify B. anthracis versus other Bacillus species.

Figure 9 is an ESI-TOF of a B. anthracis synthetic 16S_1228 duplex (reverse and 
forward strands). The technique easily distinguishes between the forward and reverse strands. 

Figure 10 is an ESI-FTICR-MS of a synthetic B. anthracis 16S_1337 46 base pair
duplex.

Figure 11 is an ESI-TOF-MS of a 56mer oligonucleotide (3 scans) from the B. anthracis
saspB gene with an internal mass standard. The internal mass standards are designated by
asterisks.

Figure 12 is an ESI-TOF-MS of an internal standard with 5 mM TBA-TFA buffer
showing that charge stripping with tributylammonium trifluoroacetate reduces the most abundant
charge state from [M-8H+]8- to [M-3H+]3-.

Figure 13 is a portion of a secondary structure defining database according to one
embodiment of the present invention, where two examples of selected sequences are displayed
graphically thereunder.

Figure 14 is a three dimensional graph demonstrating the grouping of sample molecular 
weight according to species. 

Figure 15 is a three dimensional graph demonstrating the grouping of sample molecular
weights according to species of virus and mammal infected.

Figure 16 is a three dimensional graph demonstrating the grouping of sample molecular
weights according to species of virus, and animal-origin of infectious agent.

Figure 17 is a figure depicting how a typical triangulation method of the present
invention provides for the identification of an unknown bioagent without prior knowledge of the
unknown agent. The use of different primer sets to distinguish and identify the unknown is also
depicted as primer sets I, II and III within this figure. A three-dimensional graph depicts all of
bioagent space (170), including the unknown bioagent, which, after use of primer set I (171)
according to a method of the present invention, is further differentiated and classified
according to major classifications (176). Further analysis using primer set
II (172) differentiates the unknown agent (177) from other, known agents (173) and finally, the
use of a third primer set (175) further specifies subgroups within the family of the unknown
(174).

Figure 18 shows a representative base composition probability cloud for a region of the
RNA polymerase B gene from a cluster of enterobacteria. The dark spheres represent the actual
base composition of the organisms. The lighter spheres represent the transitions among base
compositions observed in different isolates of the same species of organism.

Figure 19 shows resolution of enterobacteriae members with primers targeting RNA
polymerase B (rpoB). A single pair of primers targeting a hyper-variable region within rpoB was
sufficient to resolve most members of this group at the genus level (Salmonella from Escherichia
from Yersinia) as well as the species/strain level (E. coli K12 from O157). All organisms with
the exception of Y. pestis were tested in the lab and the measured base counts (shown with
arrow) matched the predictions in every case.

Figure 20 shows detection of S. aureus in blood. Spectra on the right indicate signals
corresponding to S. aureus detection in spiked wells A1 and A4 with no detection in control
wells A2 and A3.

Figure 21 shows a representative base composition distribution of human adenovirus
strain types for a single primer pair region on the hexon gene. The circles represent different
adenovirus sequences in our database that were used for primer design. Measurement of masses
and base counts for each of the unknown samples A, B, C and D matched one or more of the
known groups of adenoviruses.

Figure 22 shows a representative broad range survey/drill-down process as applied to
emm-typing of Streptococcus pyogenes (Group A Streptococcus: GAS). Genetic material is
extracted (201) and amplified using broad range survey primers (202). The amplification
products are analyzed (203) to determine the presence and identity of bioagents at the species
level. If Streptococcus pyogenes is detected (204), the emm-typing "drill-down" primers are
used to reexamine the extract to identify the emm-type of the sample (205). Different sets of
drill-down primers can be employed to determine a subspecies characteristic for various strains
of various bioagents (206).

Figure 23 shows a representative base composition distribution of bioagents detected in
throat swabs from military personnel using a broad range primer pair directed to 16S rRNA.

Figure 24 shows representative deconvoluted ESI-FTICR spectra of the PCR products
produced by the gtr primer for samples 12 (top) and 10 (bottom) corresponding to emm types 3
and 6, respectively. Accurate mass measurements were obtained by using an internal mass
standard and post-calibrating each spectrum; the experimental mass measurement uncertainty on
each strand is ± 0.035 Daltons (1 ppm). Unambiguous base compositions of the amplicons were
determined by calculating all putative base compositions of each strand within the measured mass
(and measured mass uncertainty) and selecting complementary pairs within the mass
measurement uncertainty. In all cases there was only one base composition within 25 ppm. The
measured mass difference of 15.985 Da between the strands shown on the left is in excellent
agreement with the theoretical mass difference of 15.994 Da expected for an A to G substitution.
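
By way of non-limiting illustration only, the calculation of putative base compositions within a mass tolerance, followed by selection of complementary pairs, may be sketched as follows. The residue masses are assumed approximate monoisotopic values for nucleotide residues within a DNA strand; terminal groups, adducts and charge are ignored unless a `terminal_offset` is supplied; and the 20-mer base counts and all function names are hypothetical rather than taken from the specification.

```python
# Assumed approximate monoisotopic masses (Da) of nucleotide residues in a
# DNA strand. These values are illustrative assumptions, not patent data.
RESIDUE_MASS = {"A": 313.0576, "G": 329.0525, "C": 289.0464, "T": 304.0460}


def strand_mass(composition, terminal_offset=0.0):
    """Mass of a strand with the given (A, G, C, T) base counts."""
    a, g, c, t = composition
    return (a * RESIDUE_MASS["A"] + g * RESIDUE_MASS["G"]
            + c * RESIDUE_MASS["C"] + t * RESIDUE_MASS["T"] + terminal_offset)


def candidate_compositions(measured_mass, ppm, terminal_offset=0.0):
    """List every (A, G, C, T) whose mass lies within `ppm` of the measurement."""
    tol = measured_mass * ppm * 1e-6
    target = measured_mass - terminal_offset
    m_a, m_g, m_c, m_t = (RESIDUE_MASS[b] for b in "AGCT")
    hits = []
    for a in range(int((target + tol) // m_a) + 1):
        rest_a = target - a * m_a
        for g in range(int((rest_a + tol) // m_g) + 1):
            rest_g = rest_a - g * m_g
            for c in range(int((rest_g + tol) // m_c) + 1):
                rest_c = rest_g - c * m_c
                t = round(rest_c / m_t)  # nearest whole number of T residues
                if t >= 0 and abs(rest_c - t * m_t) <= tol:
                    hits.append((a, g, c, t))
    return hits


def complementary_pairs(forward_hits, reverse_hits):
    """Keep forward candidates whose Watson-Crick complement was also found."""
    reverse_set = set(reverse_hits)
    pairs = []
    for a, g, c, t in forward_hits:
        partner = (t, c, g, a)  # reverse strand: A<->T and G<->C counts swap
        if partner in reverse_set:
            pairs.append(((a, g, c, t), partner))
    return pairs


# Hypothetical 20-mer duplex used only to exercise the functions.
forward = (5, 6, 4, 5)
reverse = (5, 4, 6, 5)
pairs = complementary_pairs(
    candidate_compositions(strand_mass(forward), 25),
    candidate_compositions(strand_mass(reverse), 25),
)
print(pairs)
print(round(RESIDUE_MASS["G"] - RESIDUE_MASS["A"], 4))  # A-to-G mass shift
```

Under these assumed residue masses, an A to G substitution shifts the strand mass by about 15.995 Da, in line with the theoretical difference cited for Figure 24. Requiring the two strands to be complementary removes candidate compositions that match one strand's mass but have no counterpart for the other strand.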

Figure 25 shows representative results of the base composition analysis on throat swab
samples using the six primer pairs, 5'-emm gene sequencing and the MLST gene sequencing
method of the present invention for an outbreak of Streptococcus pyogenes (group A
streptococcus; GAS) at a military training camp.

Figure 26 shows: a) a representative ESI-FTICR mass spectrum of a restriction digest of
a 986 bp region of the 16S ribosomal gene from E. coli K12 digested with a mixture of BstNI,
BsmFI, BfaI, and NcoI; b) a deconvolved representation (neutral mass) of the above spectrum
showing the base compositions derived from accurate mass measurements of each fragment; and
c) a representative reconstructed restriction map showing complete base composition coverage
for nucleotides 1-856. The NcoI did not cut.

Figure 27 shows a representative base composition distribution of poxviruses for a
single primer pair region on the DNA-dependent polymerase B gene (DdDpB). The spheres
represent different poxvirus sequences that were used for primer design.

DESCRIPTION OF EMBODIMENTS 

The present invention provides, inter alia, methods for detection and identification of
bioagents in an unbiased manner using "bioagent identifying amplicons." "Intelligent primers"
are selected to hybridize to conserved sequence regions of nucleic acids derived from a bioagent
and which bracket variable sequence regions to yield a bioagent identifying amplicon which can
be amplified and which is amenable to molecular mass determination. The molecular mass then
provides a means to uniquely identify the bioagent without a requirement for prior knowledge of
the possible identity of the bioagent. The molecular mass or corresponding "base composition
signature" (BCS) of the amplification product is then matched against a database of molecular
masses or base composition signatures. Furthermore, the method can be applied to rapid parallel
"multiplex" analyses, the results of which can be employed in a triangulation identification
strategy. The present method provides rapid throughput and does not require nucleic acid
sequencing of the amplified target sequence for bioagent detection and identification.
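
As a non-limiting illustration of how a measured molecular mass can be related to a base
composition signature, the following Python sketch enumerates every base composition whose
calculated strand mass falls within a stated tolerance of a measured mass, and then keeps only
those forward and reverse strand compositions that are complementary to each other. The average
residue masses, the terminal-group term (which assumes a 5'-phosphate and a 3'-hydroxyl) and all
function names are assumptions introduced for this example and are not taken from the present
disclosure.

```python
# Approximate average masses (Da) of DNA nucleotide residues within a strand.
# These values and END_GROUPS are assumptions made for illustration only.
RESIDUE_MASS = {"A": 313.2065, "C": 289.1817, "G": 329.2059, "T": 304.1932}
END_GROUPS = 18.0153  # one water: assumes a 5'-phosphate, 3'-hydroxyl strand


def strand_mass(a, c, g, t):
    """Calculated mass of a single strand with the given base counts."""
    return (a * RESIDUE_MASS["A"] + c * RESIDUE_MASS["C"]
            + g * RESIDUE_MASS["G"] + t * RESIDUE_MASS["T"] + END_GROUPS)


def candidate_compositions(measured_mass, ppm):
    """List every (A, C, G, T) count whose mass is within `ppm` of the measurement."""
    tol = measured_mass * ppm * 1e-6
    target = measured_mass - END_GROUPS
    hits = []
    for a in range(int((target + tol) // RESIDUE_MASS["A"]) + 1):
        rest_a = target - a * RESIDUE_MASS["A"]
        for c in range(int((rest_a + tol) // RESIDUE_MASS["C"]) + 1):
            rest_c = rest_a - c * RESIDUE_MASS["C"]
            for g in range(int((rest_c + tol) // RESIDUE_MASS["G"]) + 1):
                rest_g = rest_c - g * RESIDUE_MASS["G"]
                # Solve for the nearest whole number of T residues.
                t = max(0, round(rest_g / RESIDUE_MASS["T"]))
                if abs(rest_g - t * RESIDUE_MASS["T"]) <= tol:
                    hits.append((a, c, g, t))
    return hits


def complementary_pairs(forward_hits, reverse_hits):
    """Keep forward compositions whose Watson-Crick complement was also found."""
    reverse_set = set(reverse_hits)
    pairs = []
    for a, c, g, t in forward_hits:
        complement = (t, g, c, a)  # A<->T and C<->G counts swap between strands
        if complement in reverse_set:
            pairs.append(((a, c, g, t), complement))
    return pairs
```

In this sketch the complementarity requirement acts as a second filter: a composition that
happens to match the mass of one strand is discarded unless its complement also matches the
mass of the other strand.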

In the context of this invention, a "bioagent" is any organism, cell, or virus, living or
dead, or a nucleic acid derived from such an organism, cell or virus. Examples of bioagents
include, but are not limited to, cells (including, but not limited to, human clinical samples,
bacterial cells and other pathogens), viruses, fungi, protists, parasites, and pathogenicity
markers (including, but not limited to, pathogenicity islands, antibiotic resistance genes,
virulence factors, toxin genes and other bioregulating compounds). Samples may be alive or
dead or in a vegetative state (for example, vegetative bacteria or spores) and may be
encapsulated or bioengineered. In the context of this invention, a "pathogen" is a bioagent that
causes a disease or disorder.

Despite enormous biological diversity, all forms of life on earth share sets of essential,
common features in their genomes. Bacteria, for example, have highly conserved sequences in a
variety of locations on their genomes. Most notable is the universally conserved region of the
ribosome, but there are also conserved elements in other non-coding RNAs, including RNase P
and the signal recognition particle (SRP) among others. Bacteria have a common set of
absolutely required genes. About 250 genes are present in all bacterial species (Mushegian et al.,
Proc. Natl. Acad. Sci. U.S.A., 1996, 93, 10268; and Fraser et al., Science, 1995, 270, 397),
including tiny genomes like Mycoplasma, Ureaplasma and Rickettsia. These genes encode
proteins involved in translation, replication, recombination and repair, transcription, nucleotide
metabolism, amino acid metabolism, lipid metabolism, energy generation, uptake, secretion and
the like. Examples of these proteins are DNA polymerase III beta, elongation factor TU, heat
shock protein groEL, RNA polymerase beta, phosphoglycerate kinase, NADH dehydrogenase,
DNA ligase, DNA topoisomerase and elongation factor G. Operons can also be targeted using
the present method. One example of an operon is the bfp operon from enteropathogenic E. coli.
Multiple core chromosomal genes can be used to classify bacteria at a genus or genus species
level to determine if an organism has threat potential. The methods can also be used to detect
pathogenicity markers (plasmid or chromosomal) and antibiotic resistance genes to confirm the
threat potential of an organism and to direct countermeasures.

Since genetic data provide the underlying basis for identification of bioagents by the
methods of the present invention, it is prudent to select segments of nucleic acids which ideally
provide enough variability to distinguish each individual bioagent and whose molecular mass is
amenable to molecular mass determination. In one embodiment of the present invention, at least
one polynucleotide segment is amplified to facilitate detection and analysis in the process of
identifying the bioagent. Thus, the nucleic acid segments that provide enough variability to
distinguish each individual bioagent and whose molecular masses are amenable to molecular
mass determination are herein described as "bioagent identifying amplicons." The term
"amplicon" as used herein, refers to a segment of a polynucleotide which is amplified in an
amplification reaction. In some embodiments of the present invention, bioagent identifying
amplicons comprise from about 45 to about 150 nucleobases (i.e. from about 45 to about 150
linked nucleosides). One of ordinary skill in the art will appreciate that the invention embodies
compounds of 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66,
67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92,
93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113,
114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132,
133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, and 150
nucleobases in length.

As used herein, "intelligent primers" are primers that are designed to bind to highly
conserved sequence regions that flank an intervening variable region and yield amplification
products which ideally provide enough variability to distinguish each individual bioagent, and
which are amenable to molecular mass analysis. By the term "highly conserved," it is meant that
the sequence regions exhibit between about 80-100%, or between about 90-100%, or between
about 95-100% identity. The molecular mass of a given amplification product provides a means
of identifying the bioagent from which it was obtained, due to the variability of the variable
region. Thus, design of intelligent primers involves selection of a variable region with
appropriate variability to resolve the identity of a particular bioagent. It is the combination of the
portion of the bioagent nucleic acid molecule sequence to which the intelligent primers hybridize
and the intervening variable region that makes up the bioagent identifying amplicon. Alternately,
it is the intervening variable region by itself that makes up the bioagent identifying amplicon.

It is understood in the art that the sequence of a primer need not be 100%
complementary to that of its target nucleic acid to be specifically hybridizable. Moreover, a
primer may hybridize over one or more segments such that intervening or adjacent segments are
not involved in the hybridization event (e.g., a loop structure or hairpin structure). The primers of
the present invention can comprise at least 70%, at least 75%, at least 80%, at least 85%, at least
90%, at least 95%, or at least 99% sequence complementarity to the target region within the
highly conserved region to which they are targeted. For example, an intelligent primer wherein
18 of 20 nucleobases are complementary to a highly conserved region would represent 90
percent complementarity to the highly conserved region. In this example, the remaining
noncomplementary nucleobases may be clustered or interspersed with complementary
nucleobases and need not be contiguous to each other or to complementary nucleobases. As
such, a primer which is 18 nucleobases in length having 4 (four) noncomplementary nucleobases
which are flanked by two regions of complete complementarity with the highly conserved region
would have 77.8% overall complementarity with the highly conserved region and would thus fall
within the scope of the present invention. Percent complementarity of a primer with a region of a
target nucleic acid can be determined routinely using BLAST programs (basic local alignment
search tools) and PowerBLAST programs known in the art (Altschul et al., J. Mol. Biol., 1990,
215, 403-410; Zhang and Madden, Genome Res., 1997, 7, 649-656).
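
The arithmetic behind the two percentages given above (18 of 20 nucleobases, and 14 of 18
nucleobases) can be reproduced with a short Python sketch. It is a simplified, ungapped
position-by-position count and is not a substitute for the BLAST programs cited above; the
function names and the example sequences are hypothetical and are introduced only for
illustration.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def percent_complementarity(primer, target_site):
    """Percent of primer positions that pair with an equal-length target site.

    Both sequences are written 5'->3'; the strands pair antiparallel, so the
    target site is read in reverse against the primer. No gaps are considered.
    """
    if len(primer) != len(target_site):
        raise ValueError("primer and target site must be the same length")
    paired = sum(COMPLEMENT[p] == t
                 for p, t in zip(primer, reversed(target_site)))
    return 100.0 * paired / len(primer)


# Hypothetical 20-mer with two mismatched positions in its target site.
primer = "ACGTACGTACGTACGTACGT"
site = "CA" + reverse_complement(primer)[2:]
print(percent_complementarity(primer, site))
```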

Percent homology, sequence identity or complementarity, can be determined by, for
example, the Gap program (Wisconsin Sequence Analysis Package, Version 8 for Unix, Genetics
Computer Group, University Research Park, Madison WI), using default settings, which uses the
algorithm of Smith and Waterman (Adv. Appl. Math., 1981, 2, 482-489). In some embodiments,
complementarity of intelligent primers is between about 70% and about 80%. In other
embodiments, homology, sequence identity or complementarity, is between about 80% and
about 90%. In yet other embodiments, homology, sequence identity or complementarity, is
about 90%, about 92%, about 94%, about 95%, about 96%, about 97%, about 98%, about 99%
or about 100%.

The intelligent primers of this invention comprise from about 12 to about 35
nucleobases (i.e. from about 12 to about 35 linked nucleosides). One of ordinary skill in the art
will appreciate that the invention embodies compounds of 12, 13, 14, 15, 16, 17, 18, 19, 20, 21,
22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, or 35 nucleobases in length.

One having skill in the art armed with the preferred bioagent identifying amplicons
defined by the primers illustrated herein will be able, without undue experimentation, to identify
additional intelligent primers.

In one embodiment, the bioagent identifying amplicon is a portion of a ribosomal RNA
(rRNA) gene sequence. With the complete sequences of many of the smallest microbial genomes
now available, it is possible to identify a set of genes that defines "minimal life" and identify
composition signatures that uniquely identify each gene and organism. Genes that encode core
life functions such as DNA replication, transcription, ribosome structure, translation, and
transport are distributed broadly in the bacterial genome and are suitable regions for selection of
bioagent identifying amplicons. Ribosomal RNA (rRNA) genes comprise regions that provide
useful base composition signatures. Like many genes involved in core life functions, rRNA
genes contain sequences that are extraordinarily conserved across bacterial domains interspersed
with regions of high variability that are more specific to each species. The variable regions can
be utilized to build a database of base composition signatures. The strategy involves creating a
structure-based alignment of sequences of the small (16S) and the large (23S) subunits of the
rRNA genes. For example, there are currently over 13,000 sequences in the ribosomal RNA
database that has been created and maintained by Robin Gutell, University of Texas at Austin,
and is publicly available on the Institute for Cellular and Molecular Biology web page on the
world wide web of the Internet at, for example, "rna.icmb.utexas.edu/." There is also a publicly
available rRNA database created and maintained by the University of Antwerp, Belgium on the
world wide web of the Internet at, for example, "rrna.uia.ac.be."

These databases have been analyzed to determine regions that are useful as bioagent
identifying amplicons. The characteristics of such regions include: a) between about 80 and
100%, or greater than about 95% identity among species of the particular bioagent of interest, of
upstream and downstream nucleotide sequences which serve as sequence amplification primer
sites; b) an intervening variable region which exhibits no greater than about 5% identity among
species; and c) a separation of between about 30 and 1000 nucleotides, or no more than about
50-250 nucleotides, or no more than about 60-100 nucleotides, between the conserved regions.
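
Characteristics a) through c) above can be pictured as a simple screen applied to an
alignment. The following Python sketch is a minimal illustration only. It assumes that
"identity among species" is measured as the percentage of alignment columns in which every
aligned sequence carries the same base, which is one of several possible definitions and is
not stated in the present disclosure. The function names, the half-open (start, end) column
convention and the default thresholds are likewise assumptions chosen to mirror the broadest
ranges recited above.

```python
def percent_identical_columns(aligned, start, end):
    """Percent of alignment columns in [start, end) where all sequences agree."""
    identical = sum(1 for i in range(start, end)
                    if len({seq[i] for seq in aligned}) == 1)
    return 100.0 * identical / (end - start)


def is_candidate_region(aligned, left_site, right_site,
                        min_site_identity=80.0, max_variable_identity=5.0,
                        min_separation=30, max_separation=1000):
    """Apply criteria a) to c) to two primer sites given as (start, end) columns."""
    # c) separation between the conserved primer sites
    separation = right_site[0] - left_site[1]
    if not (min_separation <= separation <= max_separation):
        return False
    # a) both primer sites must be sufficiently conserved
    sites_ok = all(
        percent_identical_columns(aligned, s, e) >= min_site_identity
        for s, e in (left_site, right_site))
    # b) the intervening region must be sufficiently variable
    variable = percent_identical_columns(aligned, left_site[1], right_site[0])
    return sites_ok and variable <= max_variable_identity
```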

As a non-limiting example, for identification of Bacillus species, the conserved
sequence regions of the chosen bioagent identifying amplicon must be highly conserved among
all Bacillus species while the variable region of the bioagent identifying amplicon is sufficiently
variable such that the molecular masses of the amplification products of all species of Bacillus
are distinguishable.

Bioagent identifying amplicons amenable to molecular mass determination are either of
a length, size or mass compatible with the particular mode of molecular mass determination or
compatible with a means of providing a predictable fragmentation pattern in order to obtain
predictable fragments of a length compatible with the particular mode of molecular mass
determination. Such means of providing a predictable fragmentation pattern of an amplification
product include, but are not limited to, cleavage with restriction enzymes or cleavage primers,
for example.

Identification of bioagents can be accomplished at different levels using intelligent
primers suited to resolution of each individual level of identification. "Broad range survey"
intelligent primers are designed with the objective of identifying a bioagent as a member of a
particular division of bioagents. A "bioagent division" is defined as a group of bioagents above the
species level and includes but is not limited to: orders, families, classes, clades, genera or other
such groupings of bioagents above the species level. As a non-limiting example, members of the
Bacillus/Clostridia group or gamma-proteobacteria group may be identified as such by
employing broad range survey intelligent primers such as primers that target 16S or 23S
ribosomal RNA.

In some embodiments, broad range survey intelligent primers are capable of
identification of bioagents at the species level. One main advantage of the detection methods of
the present invention is that the broad range survey intelligent primers need not be specific for a
particular bacterial species, or even genus, such as Bacillus or Streptomyces. Instead, the primers
recognize highly conserved regions across hundreds of bacterial species including, but not
limited to, the species described herein. Thus, the same broad range survey intelligent primer
pair can be used to identify any desired bacterium because it will bind to the conserved regions
that flank a variable region specific to a single species, or common to several bacterial species,
allowing unbiased nucleic acid amplification of the intervening sequence and determination of its
molecular weight and base composition. For example, the 16S_971-1062, 16S_1228-1310 and
16S_1100-1188 regions are 98-99% conserved in about 900 species of bacteria (16S=16S rRNA,
numbers indicate nucleotide position). In one embodiment of the present invention, primers used
in the present method bind to one or more of these regions or portions thereof.

Due to their overall conservation, the flanking rRNA primer sequences serve as good
intelligent primer binding sites to amplify the nucleic acid region of interest for most, if not all,
bacterial species. The intervening region between the sets of primers varies in length and/or
composition, and thus provides a unique base composition signature. Examples of intelligent
primers that amplify regions of the 16S and 23S rRNA are shown in Figures 1A-1H. A typical
primer amplified region in 16S rRNA is shown in Figure 2. The arrows represent primers that
bind to highly conserved regions that flank a variable region in 16S rRNA domain III. The
amplified region is the stem-loop structure under "1100-1188." It is advantageous to design the
broad range survey intelligent primers to minimize the number of primers required for the
analysis, and to allow detection of multiple members of a bioagent division using a single pair of
primers. The advantage of using broad range survey intelligent primers is that once a bioagent is
broadly identified, the process of further identification at species and sub-species levels is
facilitated by directing the choice of additional intelligent primers.

"Division-wide" intelligent primers are designed with an objective of identifying a
bioagent at the species level. As a non-limiting example, Bacillus anthracis, Bacillus cereus
and Bacillus thuringiensis can be distinguished from each other using division-wide intelligent
primers. Division-wide intelligent primers are not always required for identification at the
species level because broad range survey intelligent primers may provide sufficient identification
resolution to accomplish this identification objective.

"Drill-down" intelligent primers are designed with an objective of identifying a
sub-species characteristic of a bioagent. A "sub-species characteristic" is defined as a property
imparted to a bioagent at the sub-species level of identification as a result of the presence or
absence of a particular segment of nucleic acid. Such sub-species characteristics include, but are
not limited to, strains, sub-types, pathogenicity markers such as antibiotic resistance genes,
pathogenicity islands, toxin genes and virulence factors. Identification of such sub-species
characteristics is often critical for determining proper clinical treatment of pathogen infections.

Chemical Modifications of Intelligent Primers

Ideally, intelligent primer hybridization sites are highly conserved in order to facilitate
the hybridization of the primer. In cases where primer hybridization is less efficient due to lower
levels of conservation of sequence, intelligent primers can be chemically modified to improve
the efficiency of hybridization.

For example, because any variation (due to codon wobble in the 3rd position) in these
conserved regions among species is likely to occur in the third position of a DNA triplet,
oligonucleotide primers can be designed such that the nucleotide corresponding to this position is
a base which can bind to more than one nucleotide, referred to herein as a "universal base." For
example, under this "wobble" pairing, inosine (I) binds to U, C or A; guanine (G) binds to U or
C, and uridine (U) binds to U or C. Other examples of universal bases include nitroindoles such
as 5-nitroindole or 3-nitropyrrole (Loakes et al., Nucleosides and Nucleotides, 1995, 14, 1001-
1003), the degenerate nucleotides dP or dK (Hill et al.), an acyclic nucleoside analog containing
5-nitroindazole (Van Aerschot et al., Nucleosides and Nucleotides, 1995, 14, 1053-1056) or the
purine analog 1-(2-deoxy-β-D-ribofuranosyl)-imidazole-4-carboxamide (Sala et al., Nucl. Acids
Res., 1996, 24, 3302-3306).

In another embodiment of the invention, to compensate for the somewhat weaker
binding by the "wobble" base, the oligonucleotide primers are designed such that the first and
second positions of each triplet are occupied by nucleotide analogs which bind with greater
affinity than the unmodified nucleotide. Examples of these analogs include, but are not limited
to, 2,6-diaminopurine which binds to thymine, propyne T which binds to adenine and propyne C
and phenoxazines, including G-clamp, which binds to G. Propynylated pyrimidines are described
in U.S. Patent Nos. 5,645,985, 5,830,653 and 5,484,908, each of which is commonly owned and
incorporated herein by reference in its entirety. Propynylated primers are claimed in U.S. Serial
No. 10/294,203 which is also commonly owned and incorporated herein by reference in its entirety.
Phenoxazines are described in U.S. Patent Nos. 5,502,177, 5,763,588, and 6,005,096, each of
which is incorporated herein by reference in its entirety. G-clamps are described in U.S. Patent
Nos. 6,007,992 and 6,028,183, each of which is incorporated herein by reference in its entirety.

A theoretically ideal bioagent detector would identify, quantify, and report the complete
nucleic acid sequence of every bioagent that reached the sensor. The complete sequence of the
nucleic acid component of a pathogen would provide all relevant information about the threat,
including its identity and the presence of drug-resistance or pathogenicity markers. This ideal has
not yet been achieved. However, the present invention provides a straightforward strategy for
obtaining information with the same practical value based on analysis of bioagent identifying
amplicons by molecular mass determination.

In some cases, a molecular mass of a given bioagent identifying amplicon alone does
not provide enough resolution to unambiguously identify a given bioagent. For example, the
molecular mass of the bioagent identifying amplicon obtained using the intelligent primer pair
"16S_971" would be 55622 Da for both E. coli and Salmonella typhimurium. However, if
additional intelligent primers are employed to analyze additional bioagent identifying amplicons,
a "triangulation identification" process is enabled. For example, the "16S_1100" intelligent
primer pair yields molecular masses of 55009 and 55005 Da for E. coli and Salmonella
typhimurium, respectively. Furthermore, the "23S_855" intelligent primer pair yields molecular
masses of 42656 and 42698 Da for E. coli and Salmonella typhimurium, respectively. In this
basic example, the second and third intelligent primer pairs provided the additional
"fingerprinting" capability or resolution to distinguish between the two bioagents.

In another embodiment, the triangulation identification process is pursued by measuring
signals from a plurality of bioagent identifying amplicons selected within multiple core genes.
This process is used to reduce false negative and false positive signals, and enable reconstruction
of the origin of hybrid or otherwise engineered bioagents. In this process, after identification of
multiple core genes, alignments are created from nucleic acid sequence databases. The
alignments are then analyzed for regions of conservation and variation, and bioagent identifying
amplicons are selected to distinguish bioagents based on specific genomic differences. For
example, identification of the three part toxin genes typical of B. anthracis (Bowen et al., J.
Appl. Microbiol., 1999, 87, 270-278) in the absence of the expected signatures from the B.
anthracis genome would suggest a genetic engineering event.
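
The E. coli and Salmonella typhimurium example given above can be restated as a small
Python sketch of the triangulation idea: each additional primer pair narrows the set of
reference bioagents whose expected amplicon mass agrees with the measurement. The reference
masses are the values quoted above for the 16S_971, 16S_1100 and 23S_855 primer pairs; the
dictionary layout, the function name and the 1 Da matching tolerance are assumptions
introduced for illustration only.

```python
# Expected amplicon masses (Da) per primer pair, as quoted in the example above.
REFERENCE_MASSES = {
    "E. coli": {"16S_971": 55622, "16S_1100": 55009, "23S_855": 42656},
    "Salmonella typhimurium": {"16S_971": 55622, "16S_1100": 55005, "23S_855": 42698},
}


def triangulate(measured, reference=REFERENCE_MASSES, tolerance_da=1.0):
    """Return the bioagents consistent with every measured amplicon mass.

    `measured` maps a primer-pair name to a measured mass in Da. The
    tolerance is a hypothetical value chosen for this example.
    """
    candidates = set(reference)
    for primer_pair, mass in measured.items():
        candidates = {
            name for name in candidates
            if primer_pair in reference[name]
            and abs(reference[name][primer_pair] - mass) <= tolerance_da
        }
    return candidates


# One primer pair alone leaves both candidates; a second pair resolves them.
print(triangulate({"16S_971": 55622}))
print(triangulate({"16S_971": 55622, "16S_1100": 55005}))
```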

The triangulation identification process can be pursued by characterization of bioagent
identifying amplicons in a massively parallel fashion using the polymerase chain reaction (PCR),
such as multiplex PCR, and mass spectrometric (MS) methods. Sufficient quantities of nucleic
acids should be present for detection of bioagents by MS. A wide variety of techniques for
preparing large amounts of purified nucleic acids or fragments thereof are well known to those of
skill in the art. PCR requires one or more pairs of oligonucleotide primers that bind to regions
which flank the target sequence(s) to be amplified. These primers prime synthesis of a different
strand of DNA with synthesis occurring in the direction of one primer towards the other primer.
The primers, DNA to be amplified, a thermostable DNA polymerase (e.g. Taq polymerase), the
four deoxynucleotide triphosphates, and a buffer are combined to initiate DNA synthesis. The
solution is denatured by heating, then cooled to allow annealing of newly added primer, followed
by another round of DNA synthesis. This process is typically repeated for about 30 cycles,
resulting in amplification of the target sequence.

Although the use of PCR is suitable, other nucleic acid amplification techniques may
also be used, including ligase chain reaction (LCR) and strand displacement amplification
(SDA). The high-resolution MS technique allows separation of bioagent spectral lines from
background spectral lines in highly cluttered environments.

In another embodiment, the detection scheme for the PCR products generated from the
bioagent(s) incorporates at least three features. First, the technique simultaneously detects and
differentiates multiple (generally about 6-10) PCR products. Second, the technique provides a
molecular mass that uniquely identifies the bioagent from the possible primer sites. Finally, the
detection technique is rapid, allowing multiple PCR reactions to be run in parallel.

Mass spectrometry (MS)-based detection of PCR products provides a means for
determination of BCS that has several advantages. MS is intrinsically a parallel detection scheme
without the need for radioactive or fluorescent labels, since every amplification product is
identified by its molecular mass. The current state of the art in mass spectrometry is such that
less than femtomole quantities of material can be readily analyzed to afford information about
the molecular contents of the sample. An accurate assessment of the molecular mass of the
material can be quickly obtained, irrespective of whether the molecular weight of the sample is
several hundred, or in excess of one hundred thousand atomic mass units (amu) or Daltons.
Intact molecular ions can be generated from amplification products using one of a variety of
ionization techniques to convert the sample to gas phase. These ionization methods include, but
are not limited to, electrospray ionization (ES), matrix-assisted laser desorption ionization
(MALDI) and fast atom bombardment (FAB). For example, MALDI of nucleic acids, along with
examples of matrices for use in MALDI of nucleic acids, are described in WO 98/54751
(Genetrace, Inc.).

In some embodiments, large DNAs and RNAs, or large amplification products
therefrom, can be digested with restriction endonucleases prior to ionization. Thus, for example,
an amplification product that was 10 kDa could be digested with a series of restriction
endonucleases to produce a panel of, for example, 100 Da fragments. Restriction endonucleases
and their sites of action are well known to the skilled artisan. In this manner, mass spectrometry
can be performed for the purposes of restriction mapping.

Upon ionization, several peaks are observed from one sample due to the formation of
ions with different charges. Averaging the multiple readings of molecular mass obtained from a
single mass spectrum affords an estimate of molecular mass of the bioagent. Electrospray
ionization mass spectrometry (ESI-MS) is particularly useful for very high molecular weight
polymers such as proteins and nucleic acids having molecular weights greater than 10 kDa, since
it yields a distribution of multiply-charged molecules of the sample without causing a significant
amount of fragmentation.

The mass detectors used in the methods of the present invention include, but are not
limited to, Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR-MS), ion trap,
quadrupole, magnetic sector, time of flight (TOF), Q-TOF, and triple quadrupole.
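
The averaging of multiple charge-state readings described above for electrospray spectra
can be sketched in Python as follows. This is a minimal illustration that assumes
negative-ion mode, in which an ion of charge z corresponds to the neutral molecule less z
protons, and assumes that the charge state of each peak has already been assigned. The
function names and the example peak list are hypothetical.

```python
PROTON_MASS = 1.007276  # Da


def neutral_mass(mz, z):
    """Neutral mass implied by one negative-mode peak [M - zH]^z-.

    For such an ion m/z = (M - z * PROTON_MASS) / z, so M = z * (m/z + PROTON_MASS).
    """
    return z * (mz + PROTON_MASS)


def estimate_mass(peaks):
    """Average the neutral masses implied by a list of (m/z, charge) peaks."""
    masses = [neutral_mass(mz, z) for mz, z in peaks]
    return sum(masses) / len(masses)


# Hypothetical charge-state envelope for a strand of about 26 kDa.
true_mass = 26000.0
peaks = [(true_mass / z - PROTON_MASS, z) for z in (12, 13, 14, 15)]
print(estimate_mass(peaks))
```

With real data each peak carries its own measurement error, so the average over the
charge-state envelope is generally a better estimate than any single peak.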

In general, the mass spectrometric techniques which can be used in the present
invention include, but are not limited to, tandem mass spectrometry, infrared multiphoton
dissociation and pyrolytic gas chromatography mass spectrometry (PGC-MS). In one
embodiment of the invention, the bioagent detection system operates continually in bioagent
detection mode using pyrolytic GC-MS without PCR for rapid detection of increases in biomass
(for example, increases in fecal contamination of drinking water or of germ warfare agents). To
achieve minimal latency, a continuous sample stream flows directly into the PGC-MS
combustion chamber. When an increase in biomass is detected, a PCR process is automatically
initiated. Bioagent presence produces elevated levels of large molecular fragments from, for
example, about 100-7,000 Da which are observed in the PGC-MS spectrum. The observed mass
spectrum is compared to a threshold level and when levels of biomass are determined to exceed a
predetermined threshold, the bioagent classification process described hereinabove (combining
PCR and MS, such as FT-ICR MS) is initiated. Optionally, alarms or other processes (halting
ventilation flow, physical isolation) are also initiated by this detected biomass level.
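
The threshold comparison described above can be pictured with a short Python sketch. It
assumes, purely for illustration, that the biomass level is taken as the summed intensity of
all peaks between about 100 and 7,000 Da and that the spectrum is available as a list of
(mass, intensity) pairs. The function names, the data layout and the threshold value are
hypothetical and are not specified in the present disclosure.

```python
def biomass_signal(spectrum, low_da=100.0, high_da=7000.0):
    """Sum the intensities of all peaks whose mass lies in [low_da, high_da]."""
    return sum(intensity for mass, intensity in spectrum
               if low_da <= mass <= high_da)


def should_initiate_pcr(spectrum, threshold):
    """True when the summed signal in the mass window exceeds the threshold."""
    return biomass_signal(spectrum) > threshold


# Hypothetical spectra as (mass in Da, intensity) pairs.
background = [(50.0, 900.0), (450.0, 3.0), (2500.0, 1.0)]
elevated = [(50.0, 900.0), (450.0, 40.0), (2500.0, 75.0), (6800.0, 20.0)]
print(should_initiate_pcr(background, threshold=100.0))
print(should_initiate_pcr(elevated, threshold=100.0))
```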

The accurate measurement of molecular mass for large DNAs is limited by the
adduction of cations from the PCR reaction to each strand, resolution of the isotopic peaks from
natural abundance 13C and 15N isotopes, and assignment of the charge state for any ion. The
cations are removed by in-line dialysis using a flow-through chip that brings the solution
containing the PCR products into contact with a solution containing ammonium acetate in the
presence of an electric field gradient orthogonal to the flow. The latter two problems are
addressed by operating with a resolving power of >100,000 and by incorporating isotopically
depleted nucleotide triphosphates into the DNA. The resolving power of the instrument is also a
consideration. At a resolving power of 10,000, the modeled signal from the [M-14H+]14- charge
state of an 84mer PCR product is poorly characterized and assignment of the charge state or
exact mass is impossible. At a resolving power of 33,000, the peaks from the individual isotopic
components are visible. At a resolving power of 100,000, the isotopic peaks are resolved to the
baseline and assignment of the charge state for the ion is straightforward. The [13C, 15N]-depleted
triphosphates are obtained, for example, by growing microorganisms on depleted media and
harvesting the nucleotides (Batey et al., Nucl. Acids Res., 1992, 20, 4515-4523).
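
The dependence on resolving power described above follows from the spacing of isotopic
peaks, which shrinks as the charge state increases. The Python sketch below estimates the
resolving power (m divided by delta m) at which adjacent isotopic peaks of a 14- ion just
begin to separate. The neutral mass of about 26 kDa assumed for an 84mer strand, the use of
the 13C/12C mass difference as the isotope spacing and the function names are assumptions
made for illustration only.

```python
PROTON_MASS = 1.007276     # Da
ISOTOPE_SPACING = 1.00335  # Da, approximate 13C - 12C mass difference


def ion_mz(neutral_mass, z):
    """m/z of the negative-mode ion [M - zH]^z-."""
    return neutral_mass / z - PROTON_MASS


def isotope_spacing_mz(z):
    """Spacing between adjacent isotopic peaks on the m/z axis at charge z."""
    return ISOTOPE_SPACING / z


def resolving_power_for_isotopes(neutral_mass, z):
    """m / delta-m at which adjacent isotopic peaks are one peak width apart."""
    return ion_mz(neutral_mass, z) / isotope_spacing_mz(z)


# Assumed mass of roughly 26 kDa for a single 84mer strand at charge state 14-.
print(round(resolving_power_for_isotopes(26000.0, 14)))
```

Under these assumptions the result is on the order of 26,000, which lies between the 10,000
and 33,000 figures discussed above. The value marks only the onset of separation; resolving
the isotopic peaks to the baseline calls for a considerably higher resolving power.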

While mass measurements of intact nucleic acid regions are believed to be adequate to
determine most bioagents, tandem mass spectrometry (MSn) techniques may provide more
definitive information pertaining to molecular identity or sequence. Tandem MS involves the
coupled use of two or more stages of mass analysis where both the separation and detection steps
are based on mass spectrometry. The first stage is used to select an ion or component of a sample
from which further structural information is to be obtained. The selected ion is then fragmented
using, e.g., blackbody irradiation, infrared multiphoton dissociation, or collisional activation. For
example, ions generated by electrospray ionization (ESI) can be fragmented using IR
multiphoton dissociation. This activation leads to dissociation of glycosidic bonds and the
phosphate backbone, producing two series of fragment ions, called the w-series (having an intact
3' terminus and a 5' phosphate following internal cleavage) and the a-Base series (having an
intact 5' terminus and a 3' furan).

The second stage of mass analysis is then used to detect and measure the mass of these 
resulting fragments of product ions. Such ion selection followed by fragmentation routines can 
be performed multiple times so as to essentially completely dissect the molecular sequence of a 
sample. 

If there are two or more targets of similar molecular mass, or if a single amplification
reaction results in a product that has the same mass as two or more bioagent reference standards,
they can be distinguished by using mass-modifying "tags." In this embodiment of the invention,
a nucleotide analog or "tag" is incorporated during amplification (e.g., a 5-(trifluoromethyl)
deoxythymidine triphosphate) which has a different molecular weight than the unmodified base
so as to improve distinction of masses. Such tags are described in, for example, PCT
WO97/33000, which is incorporated herein by reference in its entirety. This further limits the
number of possible base compositions consistent with any mass. For example,
5-(trifluoromethyl)deoxythymidine triphosphate can be used in place of dTTP in a separate nucleic
acid amplification reaction. Measurement of the mass shift between a conventional amplification
product and the tagged product is used to quantitate the number of thymidine nucleotides in each
of the single strands. Because the strands are complementary, the number of adenosine
nucleotides in each strand is also determined.
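As a minimal sketch of the tag arithmetic described above: replacing the 5-methyl group of thymidine with a trifluoromethyl group exchanges three hydrogens for three fluorines, so each incorporated tag adds a fixed mass. The function name, the tolerance, and the example masses are illustrative assumptions; only the principle (mass shift divided by per-tag shift gives the thymidine count) comes from the text.

```python
H_MASS = 1.00794   # average atomic mass of hydrogen, Da
F_MASS = 18.99840  # average atomic mass of fluorine, Da

# CH3 -> CF3 swaps three H atoms for three F atoms (about +53.97 Da per tag).
TAG_SHIFT = 3 * (F_MASS - H_MASS)


def count_tagged_bases(untagged_mass, tagged_mass, shift_per_base=TAG_SHIFT):
    """Infer how many tagged bases a strand carries from its mass shift."""
    ratio = (tagged_mass - untagged_mass) / shift_per_base
    count = round(ratio)
    if abs(ratio - count) > 0.2:  # hypothetical sanity tolerance
        raise ValueError("mass shift is not a whole number of tags")
    return count


# Hypothetical strand carrying 30 thymidines.
untagged = 30000.0
tagged = untagged + 30 * TAG_SHIFT
n_t = count_tagged_bases(untagged, tagged)
print("thymidines in this strand:", n_t)
print("adenosines in the complementary strand:", n_t)
```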

In another amplification reaction, the number of G and C residues in each strand is
determined using, for example, the cytidine analog 5-methylcytosine (5-meC) or propyne C. The
combination of the A/T reaction and G/C reaction, followed by molecular weight determination,
provides a unique base composition. This method is summarized in Figure 4 and Table 1.

The mass tag phosphorothioate A (A*) was used to distinguish a Bacillus anthracis
cluster. The B. anthracis (A14G9C14T9) had an average MW of 14072.26, the B. anthracis
(A1A*13G9C14T9) had an average molecular weight of 14281.11, and the phosphorothioate A had
an average molecular weight shift of +16.06, as determined by ESI-TOF MS. The deconvoluted
spectra are shown in Figure 5.
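The per-tag shift quoted above can be checked directly from the two reported molecular weights. This short sketch assumes 13 phosphorothioate A residues in the tagged product and compares the result with the expected mass difference when sulfur replaces one oxygen; the atomic masses are standard average values, not figures from the text.

```python
untagged_mw = 14072.26  # untagged B. anthracis product, average MW from the text
tagged_mw = 14281.11    # phosphorothioate-tagged product, average MW from the text
n_tags = 13             # assumed number of A* residues in the tagged product

per_tag_shift = (tagged_mw - untagged_mw) / n_tags

# A phosphorothioate replaces one non-bridging oxygen with sulfur.
sulfur_minus_oxygen = 32.065 - 15.9994

print("observed shift per tag: %.3f Da" % per_tag_shift)
print("expected S - O difference: %.3f Da" % sulfur_minus_oxygen)
```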

In another example, assume the measured molecular masses of each strand are
30,000.115 Da and 31,000.115 Da respectively, and the measured number of dT and dA residues
are (30,28) and (28,30). If the molecular mass is accurate to 100 ppm, there are 7 possible
combinations of dG+dC for each strand. However, if the measured molecular mass is
accurate to 10 ppm, there are only 2 combinations of dG+dC, and at 1 ppm accuracy there is
only one possible base composition for each strand.
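The way mass accuracy narrows the set of admissible compositions can be sketched as a brute-force search. The example below is a simplified illustration, not the procedure used to obtain the counts quoted above: it uses commonly tabulated average residue masses (two decimal places, so not truly ppm-grade) and a hypothetical strand, and it will not reproduce the 7/2/1 figures. Real work at this accuracy would use more precise masses.

```python
# Average masses of nucleotide residues within a DNA strand, Da (assumed table).
RESIDUE = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}
END_CORRECTION = -61.96  # strand without a 5' phosphate


def strand_mass(a, c, g, t):
    """Average mass of a single strand with the given base counts."""
    return (a * RESIDUE["A"] + c * RESIDUE["C"]
            + g * RESIDUE["G"] + t * RESIDUE["T"] + END_CORRECTION)


def gc_candidates(measured, n_a, n_t, ppm, max_count=100):
    """All (G, C) counts whose strand mass matches `measured` within `ppm`."""
    tolerance = measured * ppm * 1e-6
    hits = []
    for g in range(max_count + 1):
        for c in range(max_count + 1):
            if abs(strand_mass(n_a, c, g, n_t) - measured) <= tolerance:
                hits.append((g, c))
    return hits


# Hypothetical strand: 28 A, 19 C, 20 G, 30 T (A and T known from the tag reaction).
measured = strand_mass(28, 19, 20, 30)
for ppm in (100, 10, 1):
    print(ppm, "ppm ->", gc_candidates(measured, n_a=28, n_t=30, ppm=ppm))
```

Tightening the tolerance can only shrink the candidate list, which is the behavior the paragraph describes.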

Signals from the mass spectrometer may be input to a maximum-likelihood detection
and classification algorithm such as is widely used in radar signal processing. The detection
processing uses matched filtering of BCS observed in mass-basecount space and allows for
detection and subtraction of signatures from known, harmless organisms, and for detection of
unknown bioagent threats. Comparison of newly observed bioagents to known bioagents is also
possible, for estimation of threat level, by comparing their BCS to those of known organisms and
to known forms of pathogenicity enhancement, such as insertion of antibiotic resistance genes or
toxin genes.

Processing may end with a Bayesian classifier using log likelihood ratios developed
from the observed signals and average background levels. The program emphasizes performance
predictions culminating in probability-of-detection versus probability-of-false-alarm plots for
conditions involving complex backgrounds of naturally occurring organisms and environmental
contaminants. Matched filters consist of a priori expectations of signal values given the set of
primers used for each of the bioagents. A genomic sequence database (e.g. GenBank) is used to
define the mass basecount matched filters. The database contains known threat agents and benign
background organisms. The latter is used to estimate and subtract the signature produced by the
background organisms. A maximum likelihood detection of known background organisms is
implemented using matched filters and a running-sum estimate of the noise covariance.
Background signal strengths are estimated and used along with the matched filters to form
signatures that are then subtracted. The maximum likelihood process is applied to this "cleaned
up" data in a similar manner employing matched filters for the organisms and a running-sum
estimate of the noise-covariance for the cleaned up data.

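The detect, subtract, and re-detect sequence described above can be sketched in a few lines. This is a deliberately simplified illustration under stated assumptions: the noise covariance is taken as diagonal and supplied by the caller (rather than estimated as a running sum), the signal is a short list of hypothetical bins in mass-basecount space, and all function names are invented for the example.

```python
def whitened_dot(u, v, noise_var):
    """Inner product weighted by the inverse (diagonal) noise covariance."""
    return sum(a * b / s for a, b, s in zip(u, v, noise_var))


def fit_amplitude(observed, template, noise_var):
    """Maximum-likelihood signal strength for a template in Gaussian noise."""
    amp = (whitened_dot(observed, template, noise_var)
           / whitened_dot(template, template, noise_var))
    return max(amp, 0.0)


def subtract_signature(observed, template, noise_var):
    """Estimate a known organism's strength and remove its signature."""
    amp = fit_amplitude(observed, template, noise_var)
    cleaned = [x - amp * t for x, t in zip(observed, template)]
    return cleaned, amp


def log_likelihood_ratio(observed, template, noise_var):
    """log p(x | template present) - log p(x | noise only), Gaussian noise."""
    return (whitened_dot(observed, template, noise_var)
            - 0.5 * whitened_dot(template, template, noise_var))


# Hypothetical matched filters over four bins.
background = [1.0, 0.0, 1.0, 0.0]  # benign organism
threat = [0.0, 1.0, 0.0, 0.0]      # threat agent
noise_var = [0.01] * 4

observed = [3.0, 1.0, 3.0, 0.0]    # strong background plus a weak threat
cleaned, strength = subtract_signature(observed, background, noise_var)
print("estimated background strength:", strength)
print("threat log likelihood ratio after cleanup:",
      log_likelihood_ratio(cleaned, threat, noise_var))
```

A positive log likelihood ratio favors the presence of the threat signature; a full implementation would compare it against a threshold chosen from the desired false-alarm rate.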
Although the molecular mass of amplification products obtained using intelligent
primers provides a means for identification of bioagents, conversion of molecular mass data to a
base composition signature is useful for certain analyses. As used herein, a "base composition
signature" (BCS) is the exact base composition determined from the molecular mass of a
bioagent identifying amplicon. In one embodiment, a BCS provides an index of a specific gene
in a specific organism.

Base compositions, like sequences, vary slightly from isolate to isolate within species. It
is possible to manage this diversity by building "base composition probability clouds" around the
composition constraints for each species. This permits identification of organisms in a fashion
similar to sequence analysis. A "pseudo four-dimensional plot" can be used to visualize the
concept of base composition probability clouds (Figure 18). Optimal primer design requires
optimal choice of bioagent identifying amplicons and maximizes the separation between the base
composition signatures of individual bioagents. Areas where clouds overlap indicate regions that
may result in a misclassification, a problem which is overcome by selecting primers that provide
information from different bioagent identifying amplicons, ideally maximizing the separation of
base compositions. Thus, one aspect of the utility of an analysis of base composition probability
clouds is that it provides a means for screening primer sets in order to avoid potential
misclassifications of BCS and bioagent identity. Another aspect of the utility of base
composition probability clouds is that they provide a means for predicting the identity of a
bioagent whose exact measured BCS was not previously observed and/or indexed in a BCS
database due to evolutionary transitions in its nucleic acid sequence.
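A toy version of the cloud idea can make the two uses above concrete. In this sketch each "cloud" is reduced to a centre composition and a hard radius in (A, G, C, T) space, which is a simplifying assumption; the text describes probability clouds, not fixed-radius balls. The species names and compositions are hypothetical.

```python
def distance(a, b):
    """Manhattan distance between two (A, G, C, T) base compositions."""
    return sum(abs(x - y) for x, y in zip(a, b))


def classify(observed, clouds):
    """Names of all clouds containing the observed composition.

    `clouds` maps a species name to (centre composition, radius).
    """
    return sorted(name for name, (centre, radius) in clouds.items()
                  if distance(observed, centre) <= radius)


# Hypothetical clouds for one amplicon.
tight = {"species_1": ((14, 9, 14, 9), 2), "species_2": ((12, 11, 13, 10), 2)}
wide = {"species_1": ((14, 9, 14, 9), 3), "species_2": ((12, 11, 13, 10), 3)}

# A composition never indexed before still falls inside one cloud.
print(classify((14, 10, 13, 9), tight))

# With wider clouds the same amplicon cannot separate the two species,
# which signals that a different amplicon or primer pair is needed.
print(classify((13, 10, 13, 9), wide))
```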

It is important to note that, in contrast to probe-based techniques, mass spectrometry
determination of base composition does not require prior knowledge of the composition in order
to make the measurement, only to interpret the results. In this regard, the present invention
provides bioagent classifying information similar to DNA sequencing and phylogenetic analysis
at a level sufficient to detect and identify a given bioagent. Furthermore, the process of
determination of a previously unknown BCS for a given bioagent (for example, in a case where
sequence information is unavailable) has downstream utility by providing additional bioagent
indexing information with which to populate BCS databases. The process of future bioagent
identification is thus greatly improved as more BCS indexes become available in the BCS
databases.

Another embodiment of the present invention is a method of surveying bioagent
samples that enables detection and identification of all bacteria for which sequence information
is available using a set of twelve broad-range intelligent PCR primers. Six of the twelve primers
are "broad range survey primers" herein defined as primers targeted to broad divisions of
bacteria (for example, the Bacillus/Clostridia group or gamma-proteobacteria). The other six
primers of the group of twelve primers are "division-wide" primers herein defined as primers
that provide more focused coverage and higher resolution. This method enables identification of
nearly 100% of known bacteria at the species level. A further example of this embodiment of the
present invention is a method herein designated "survey/drill-down" wherein a subspecies
characteristic for detected bioagents is obtained using additional primers. Examples of such a
subspecies characteristic include but are not limited to: antibiotic resistance, pathogenicity
island, virulence factor, strain type, sub-species type, and clade group. Using the
survey/drill-down method, bioagent detection, confirmation and a subspecies characteristic can be
provided within hours. Moreover, the survey/drill-down method can be focused to identify
bioengineering events such as the insertion of a toxin gene into a bacterial species that does not
normally make the toxin.

The present methods allow extremely rapid and accurate detection and identification of
bioagents compared to existing methods. Furthermore, this rapid detection and identification is
possible even when sample material is impure. The methods leverage ongoing biomedical
research in virulence, pathogenicity, drug resistance and genome sequencing into a method
which provides greatly improved sensitivity, specificity and reliability compared to existing
methods, with lower rates of false positives. Thus, the methods are useful in a wide variety of
fields, including, but not limited to, those fields discussed below.

In other embodiments of the invention, the methods disclosed herein can identify
infectious agents in biological samples. At least a first biological sample containing at least a
first unidentified infectious agent is obtained. An identification analysis is carried out on the
sample, whereby the first infectious agent in the first biological sample is identified. More
particularly, a method of identifying an infectious agent in a biological entity is provided. An
identification analysis is carried out on a first biological sample obtained from the biological
entity, whereby at least one infectious agent in the biological sample from the biological entity is
identified. The obtaining and the performing steps are, optionally, repeated on at least one
additional biological sample from the biological entity.

The present invention also provides methods of identifying an infectious agent that is
potentially the cause of a health condition in a biological entity. An identification analysis is
carried out on a first test sample from a first infectious agent differentiating area of the biological
entity, whereby at least one infectious agent is identified. The obtaining and the performing steps
are, optionally, repeated on an additional infectious agent differentiating area of the biological
entity.

Biological samples include, but are not limited to, hair, mucosa, skin, nail, blood, saliva,
rectal, lung, stool, urine, breath, nasal, ocular sample, or the like. In some embodiments, one or
more biological samples are analyzed by the methods described herein. The biological sample(s)
contain at least a first unidentified infectious agent and may contain more than one infectious
agent. The biological sample(s) are obtained from a biological entity. The biological sample can
be obtained in a variety of manners such as by biopsy, swabbing, and the like. The biological
samples may be obtained by a physician in a hospital or other health care environment. The
physician may then perform the identification analysis or send the biological sample to a
laboratory to carry out the analysis.

Biological entities include, but are not limited to, a mammal, a bird, or a reptile. The 
biological entity may be a cow, horse, dog, cat, or a primate. The biological entity can also be a 
human. The biological entity may be living or dead. 

An infectious agent differentiating area is any area or location within a biological entity
that can distinguish between a harmful versus normal health condition. An infectious agent
differentiating area can be a region or area of the biological entity whereby an infectious agent is
more likely to predominate than in another region or area of the biological entity. For example,
infectious agent differentiating areas may include the blood vessels of the heart (heart disease,
coronary artery disease, etc.), particular portions of the digestive system (ulcers, Crohn's disease,
etc.), liver (hepatitis infections), and the like. In some embodiments, one or more biological
samples from a plurality of infectious agent differentiating areas are analyzed by the methods
described herein.

Infectious agents of the invention may potentially cause a health condition in a
biological entity. Health conditions include any condition, syndrome, illness, disease, or the like,
identified currently or in the future by medical personnel. Infectious agents include, but are not
limited to, bacteria, viruses, parasites, fungi, and the like.

In other embodiments of the invention, the methods disclosed herein can be used to
screen blood and other bodily fluids and tissues for pathogenic and non-pathogenic bacteria,
viruses, parasites, fungi and the like. Animal samples, including but not limited to, blood and
other bodily fluid and tissue samples, can be obtained from living animals that are known,
suspected, or not known to have a disease, infection, or condition. Alternately, animal
samples such as blood and other bodily fluid and tissue samples can be obtained from deceased
animals. Blood samples can be further separated into plasma or cellular fractions and further
screened as desired. Bodily fluids and tissues can be obtained from any part of the animal or
human body. Animal samples can be obtained from, for example, mammals and humans.

Clinical samples are analyzed for disease-causing bioagents and biowarfare pathogens
simultaneously, with detection of bioagents at levels as low as 100-1000 genomic copies in
complex backgrounds, throughput of approximately 100-300 samples, and simultaneous
detection of bacteria and viruses. Such analyses provide additional value in probing bioagent
genomes for unanticipated modifications. These analyses are carried out in reference labs,
hospitals and the LRN laboratories of the public health system in a coordinated fashion, with the
ability to report the results via a computer network to a common data-monitoring center in real
time. Clonal propagation of specific infectious agents, as occurs in the epidemic outbreak of
infectious disease, can be tracked with base composition signatures, analogous to the pulsed-field
gel electrophoresis fingerprinting patterns used in tracking the spread of specific food pathogens
in the PulseNet system of the CDC (Swaminathan et al., Emerging Infectious Diseases, 2001, 7,
382-389). The present invention provides a digital barcode in the form of a series of base
composition signatures, the combination of which is unique for each known organism. This
capability enables real-time infectious disease monitoring across broad geographic locations,
which may be essential in a simultaneous outbreak or attack in different cities.

In other embodiments of the invention, the methods disclosed herein can be used for
detecting the presence of pathogenic and non-pathogenic bacteria, viruses, parasites, fungi and
the like in organ donors and/or in organs from donors. Such examination can result in the
prevention of the transfer of, for example, viruses such as West Nile virus, hepatitis viruses,
human immunodeficiency virus, and the like from a donor to a recipient via a transplanted organ.
The methods disclosed herein can also be used for detection of host versus graft or graft versus
host rejection issues related to organ donors by detecting the presence of particular antigens in
either the graft or host known or suspected of causing such rejection. In particular, the bioagents
in this regard are the antigens of the major histocompatibility complex, such as the HLA
antigens. The present methods can also be used to detect and track emerging infectious diseases,
such as West Nile virus infection and HIV-related diseases.

In other embodiments of the invention, the methods disclosed herein can be used for
pharmacogenetic analysis and medical diagnosis including, but not limited to, cancer diagnosis
based on mutations and polymorphisms, drug resistance and susceptibility testing, screening for
and/or diagnosis of genetic diseases and conditions, and diagnosis of infectious diseases and
conditions. In the context of the present invention, pharmacogenetics is defined as the study of
variability in drug response due to genetic factors. Pharmacogenetic investigations are often
based on correlating patient outcome with variations in genes involved in the mode of action of a
given drug, for example, receptor genes or genes involved in metabolic pathways. The methods
of the present invention provide a means to analyze the DNA of a patient to provide the basis for
pharmacogenetic analysis.

The present method can also be used to detect single nucleotide polymorphisms (SNPs),
or multiple nucleotide polymorphisms, rapidly and accurately. A SNP is defined as a single base
pair site in the genome that is different from one individual to another. The difference can be
expressed either as a deletion, an insertion or a substitution, and is frequently linked to a disease
state. Because they occur every 100-1000 base pairs, SNPs are the most frequently found type of
genetic marker in the human genome.

For example, sickle cell anemia results from an A-to-T transversion, which encodes a valine
rather than a glutamic acid residue. Oligonucleotide primers may be designed such that they bind
to sequences that flank a SNP site, followed by nucleotide amplification and mass determination
of the amplified product. Because the molecular mass of the resulting product from an
individual who does not have sickle cell anemia is different from that of the product from an
individual who has the disease, the method can be used to distinguish the two individuals. Thus,
the method can be used to detect any known SNP in an individual and thus diagnose or
determine increased susceptibility to a disease or condition.

In one embodiment, blood is drawn from an individual and peripheral blood
mononuclear cells (PBMC) are isolated and simultaneously tested, such as in a high-throughput
screening method, for one or more SNPs using appropriate primers based on the known
sequences which flank the SNP region. The National Center for Biotechnology Information
maintains a publicly available database of SNPs on the world wide web of the Internet at, for
example, "ncbi.nlm.nih.gov/SNP/."
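To illustrate why a single-base substitution is visible by mass, the sketch below computes the expected mass change on each strand of an amplicon for a given substitution, such as the A-to-T change in the sickle cell example. The residue masses are commonly tabulated average values and the function name is hypothetical; this is an arithmetic illustration, not a description of the instrument workflow.

```python
# Average masses of nucleotide residues within a DNA strand, Da (assumed table).
RESIDUE = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def snp_mass_shift(ref_base, alt_base):
    """Mass change on the forward and reverse strands for one substitution."""
    forward = RESIDUE[alt_base] - RESIDUE[ref_base]
    reverse = RESIDUE[COMPLEMENT[alt_base]] - RESIDUE[COMPLEMENT[ref_base]]
    return forward, reverse


forward, reverse = snp_mass_shift("A", "T")
print("forward strand shift: %+.2f Da" % forward)
print("reverse strand shift: %+.2f Da" % reverse)
```

Because the two strands shift in opposite directions, measuring both provides an internal consistency check on the call.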

The method of the present invention can also be used for blood typing. The gene
encoding A, B or O blood type can differ by four single nucleotide polymorphisms. If the gene
contains the sequence CGTGGTGACCCTT (SEQ ID NO:5), antigen A results. If the gene
contains the sequence CGTCGTCACCGCTA (SEQ ID NO:6), antigen B results. If the gene
contains the sequence CGTGGT-ACCCCTT (SEQ ID NO:7), blood group O results ("-"
indicates a deletion). These sequences can be distinguished by designing a single primer pair
which flanks these regions, followed by amplification and mass determination.
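As a rough illustration of how the three alleles separate by mass, the sketch below computes an average mass for each sequence fragment listed above, taking SEQ ID NO:5 as CGTGGTGACCCTT. It is a simplification under stated assumptions: only the short fragments are weighed (a real amplicon would include flanking sequence and primers, which add the same mass to every allele), and the residue masses are commonly tabulated average values.

```python
from collections import Counter

# Average masses of nucleotide residues within a DNA strand, Da (assumed table).
RESIDUE = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}
END_CORRECTION = -61.96  # strand without a 5' phosphate


def fragment_mass(sequence):
    """Average mass of a single-stranded fragment; '-' marks a deleted base."""
    counts = Counter(sequence.replace("-", ""))
    return sum(RESIDUE[base] * n for base, n in counts.items()) + END_CORRECTION


ALLELES = {
    "A": "CGTGGTGACCCTT",   # SEQ ID NO:5
    "B": "CGTCGTCACCGCTA",  # SEQ ID NO:6
    "O": "CGTGGT-ACCCCTT",  # SEQ ID NO:7
}

masses = {name: fragment_mass(seq) for name, seq in ALLELES.items()}
for name, mass in masses.items():
    print("allele %s: %.2f Da" % (name, mass))
```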

The method of the present invention can also be used for detection and identification of
blood-borne pathogens such as, for example, Staphylococcus aureus.

The method of the present invention can also be used for strain typing of respiratory pathogens
in epidemic surveillance. Group A streptococci (GAS), or Streptococcus pyogenes, is one of the
most consequential causes of respiratory infections because of its prevalence and ability to cause
disease with complications such as acute rheumatic fever and acute glomerulonephritis. GAS
also causes infections of the skin (impetigo) and, in rare cases, invasive disease such as
necrotizing fasciitis and toxic shock syndrome. Despite many decades of study, the underlying
microbial ecology and natural selection that favor enhanced virulence and explosive GAS
outbreaks are still poorly understood. The ability to detect GAS and multiple other pathogenic
and non-pathogenic bacteria and viruses in patient samples would greatly facilitate our
understanding of GAS epidemics. It is also essential to be able to follow the spread of virulent
strains of GAS in populations and to distinguish virulent strains from less virulent or avirulent
streptococci that colonize the nose and throat of asymptomatic individuals at a frequency ranging
from 5-20% of the population (Bisno, A. L. (1995) in Principles and Practice of Infectious
Diseases, eds. Mandell, G. L., Bennett, J. E. & Dolin, R. (Churchill Livingstone, New York), Vol.
2, pp. 1786-1799). Molecular methods have been developed to type GAS based upon the
sequence of the emm gene that encodes the M-protein virulence factor (Beall et al., J. Clin.
Micro., 1996, 34, 953-958; Beall et al., J. Clin. Micro., 1997, 35, 1231-1235; and Facklam et al.,
Emerging Infectious Diseases, 1999, 5, 247-253). Using this molecular classification, over 150
different emm-types are defined and correlated with phenotypic properties of thousands of GAS
isolates (www.cdc.gov/ncidod/biotech/strep/strepindex.html) (Facklam et al., Clinical Infectious
Diseases, 2002, 34, 28-38). Recently, a strategy known as Multi Locus Sequence Typing
(MLST) was developed to follow the molecular epidemiology of GAS. In MLST, internal
fragments of seven housekeeping genes are amplified, sequenced, and compared to a database of
previously studied isolates (www.test.mlst.net/).

The present invention enables an emm-typing process to be carried out directly from 
throat swabs for a large number of samples within 12 hours, allowing strain tracking of an 
ongoing epidemic, even if geographically dispersed, on a larger scale than was previously
achievable.

In another embodiment, the present invention can be employed in the serotyping of
viruses including, but not limited to, adenoviruses. Adenoviruses are DNA viruses that cause
over 50% of febrile respiratory illnesses in military recruits. Human adenoviruses are divided
into six major serogroups (A through F), each containing multiple strain types. Despite the
prevalence of adenoviruses, there are no rapid methods for detecting and serotyping
adenoviruses.

In another embodiment, the present invention can be employed in distinguishing
between members of the Orthopoxvirus genus. Smallpox is caused by the Variola virus. Other
members of the genus include Vaccinia, Monkeypox, Camelpox, and Cowpox. All are capable of
infecting humans; thus, a method capable of identifying and distinguishing among members of
the Orthopoxvirus genus is a worthwhile objective.

In another embodiment, the present invention can be employed in distinguishing
between viral agents of viral hemorrhagic fevers (VHF). VHF agents include, but are not limited
to, Filoviridae (Marburg virus and Ebola virus), Arenaviridae (Lassa, Junin, Machupo, Sabia,
and Guanarito viruses), Bunyaviridae (Crimean-Congo hemorrhagic fever virus (CCHFV), Rift
Valley fever virus, and Hanta viruses), and Flaviviridae (yellow fever virus and dengue virus).
Infections by VHF viruses are associated with a wide spectrum of clinical manifestations such as
diarrhea, myalgia, cough, headache, pneumonia, encephalopathy, and hepatitis. Filoviruses,
arenaviruses, and CCHFV are of particular relevance because they can be transmitted from
human to human, thus causing epidemics with high mortality rates (Khan et al., Am. J. Trop.
Med. Hyg., 1997, 57, 519-525). In the absence of bleeding or organ manifestation, VHF is
clinically difficult to diagnose, and the various etiologic agents can hardly be distinguished by
clinical tests. Current approaches to PCR detection of these agents are time-consuming, as they
include a separate cDNA synthesis step prior to PCR, agarose gel analysis of PCR products, and
in some instances a second round of nested amplification or Southern hybridization. PCRs for
different pathogens have to be run assay by assay due to differences in cycling conditions, which
complicate broad-range testing in a short period. Moreover, post-PCR processing or nested PCR
steps included in currently used assays increase the risk of false positive results due to carryover
contamination (Kwok et al., Nature, 1989, 339, 237-238).

In another embodiment, the present invention can be employed in the diagnosis of a
plurality of etiologic agents of a disease. An "etiologic agent" is herein defined as a pathogen
acting as the causative agent of a disease. Diseases may be caused by a plurality of etiologic
agents. For example, recent studies have implicated both human herpesvirus 6 (HHV-6) and the
obligate intracellular bacterium Chlamydia pneumoniae in the etiology of multiple sclerosis
(Swanborg, Microbes and Infection, 2002, 4, 1327-1333). The present invention can be applied
to the identification of multiple etiologic agents of a disease by, for example, the use of broad
range bacterial intelligent primers and division-wide primers (if necessary) for the identification
of bacteria such as Chlamydia pneumoniae, followed by primers directed to viral housekeeping
genes for the identification of viruses such as HHV-6.

In other embodiments of the invention, the methods disclosed herein can be used for
detection and identification of pathogens in livestock. Livestock includes, but is not limited to,
cows, pigs, sheep, chickens, turkeys, goats, horses and other farm animals. For example,
conditions classified by the California Department of Food and Agriculture as emergency
conditions in livestock (www.cdfa.ca.gov/ahfss/ah/pdfs/CA_reportable_disease_list_05292002.pdf)
include, but are not limited to: Anthrax (Bacillus anthracis), Screwworm myiasis
(Cochliomyia hominivorax or Chrysomya bezziana), African trypanosomiasis (Tsetse fly
diseases), Bovine babesiosis (piroplasmosis), Bovine spongiform encephalopathy (Mad Cow),
Contagious bovine pleuropneumonia (Mycoplasma mycoides mycoides small colony),
Foot-and-mouth disease (Hoof-and-mouth), Heartwater (Cowdria ruminantium), Hemorrhagic
septicemia (Pasteurella multocida serotypes B:2 or E:2), Lumpy skin disease, Malignant catarrhal
fever (African type), Rift Valley fever, Rinderpest (Cattle plague), Theileriosis (Corridor disease,
East Coast fever), Vesicular stomatitis, Contagious agalactia (Mycoplasma species), Contagious
caprine pleuropneumonia (Mycoplasma capricolum capripneumoniae), Nairobi sheep disease,
Peste des petits ruminants (Goat plague), Pulmonary adenomatosis (Viral neoplastic pneumonia),
Salmonella abortus ovis, Sheep and goat pox, African swine fever, Classical swine fever (Hog
cholera), Japanese encephalitis, Nipah virus, Swine vesicular disease, Teschen disease
(Enterovirus encephalomyelitis), Vesicular exanthema, Exotic Newcastle disease (Viscerotropic
velogenic Newcastle disease), Highly pathogenic avian influenza (Fowl plague), African horse
sickness, Dourine (Trypanosoma equiperdum), Epizootic lymphangitis (equine blastomycosis,
equine histoplasmosis), Equine piroplasmosis (Babesia equi, B. caballi), Glanders (Farcy)
(Pseudomonas mallei), Hendra virus (Equine morbillivirus), Horse pox, Surra (Trypanosoma
evansi), Venezuelan equine encephalomyelitis, West Nile Virus, Chronic wasting disease in
cervids, and Viral hemorrhagic disease of rabbits (calicivirus).

Conditions classified by the California Department of Food and Agriculture as regulated
conditions in livestock include, but are not limited to: rabies, Bovine brucellosis (Brucella
abortus), Bovine tuberculosis (Mycobacterium bovis), Cattle scabies (multiple types),
Trichomonosis (Tritrichomonas fetus), Caprine and ovine brucellosis (excluding Brucella ovis),
Scrapie, Sheep scabies (Body mange) (Psoroptes ovis), Porcine brucellosis (Brucella suis),
Pseudorabies (Aujeszky's disease), Ornithosis (Psittacosis or avian chlamydiosis) (Chlamydia
psittaci), Pullorum disease (Fowl typhoid) (Salmonella gallinarum and pullorum), Contagious
equine metritis (Taylorella equigenitalis), Equine encephalomyelitis (Eastern and Western
equine encephalitis), Equine infectious anemia (Swamp fever), Duck viral enteritis (Duck
plague), and Tuberculosis in cervids.

Additional conditions monitored by the California Department of Food and Agriculture
include, but are not limited to: Avian tuberculosis (Mycobacterium avium),
Echinococcosis/Hydatidosis (Echinococcus species), Leptospirosis, Anaplasmosis (Anaplasma
marginale or A. centrale), Bluetongue, Bovine cysticercosis (Taenia saginata in humans),
Bovine genital campylobacteriosis (Campylobacter fetus venerealis), Dermatophilosis
(Streptothricosis, mycotic dermatitis) (Dermatophilus congolensis), Enzootic bovine leukosis
(Bovine leukemia virus), Infectious bovine rhinotracheitis (Bovine herpesvirus-1), Johne's
disease (Paratuberculosis) (Mycobacterium avium paratuberculosis), Malignant catarrhal fever
(North American), Q Fever (Coxiella burnetii), Caprine (contagious) arthritis/encephalitis,
Enzootic abortion of ewes (Ovine chlamydiosis) (Chlamydia psittaci), Maedi-Visna (Ovine
progressive pneumonia), Atrophic rhinitis (Bordetella bronchiseptica, Pasteurella multocida),
Porcine cysticercosis (Taenia solium in humans), Porcine reproductive and respiratory
syndrome, Transmissible gastroenteritis (coronavirus), Trichinellosis (Trichinella spiralis),
Avian infectious bronchitis, Avian infectious laryngotracheitis, Duck viral hepatitis, Fowl
cholera (Pasteurella multocida), Fowl pox, Infectious bursal disease (Gumboro disease), Low
pathogenic avian influenza, Marek's disease, Mycoplasmosis (Mycoplasma gallisepticum),
Equine influenza, Equine rhinopneumonitis (Equine herpesvirus-1), Equine viral arteritis, and
Horse mange (multiple types).

A key problem in determining that an infectious outbreak is the result of a bioterrorist
attack is the sheer variety of organisms that might be used by terrorists. According to a recent
review (Taylor et al., Philos. Trans. R. Soc. Lond. B. Biol. Sci., 2001, 356, 983-989), there are
over 1400 organisms infectious to humans; most of these have the potential to be used in a
deliberate, malicious attack. These numbers do not include numerous strain variants of each
organism, bioengineered versions, or pathogens that infect plants or animals. Paradoxically, most
of the new technology being developed for detection of biological weapons incorporates a
version of quantitative PCR, which is based upon the use of highly specific primers and probes
designed to selectively identify specific pathogenic organisms. This approach requires
assumptions about the type and strain of bacteria or virus which is expected to be detected.
Although this approach will work for the most obvious organisms, like smallpox and anthrax,
experience has shown that it is very difficult to anticipate what a terrorist will do.

The present invention can be used to detect and identify any biological agent, including 
bacteria, viruses, fungi and toxins without prior knowledge of the organism being detected and 
identified. As one example, where the agent is a biological threat, the information obtained,
such as the presence of toxin genes, pathogenicity islands and antibiotic resistance genes,
is used to determine practical information needed for countermeasures. In addition, the methods
can be used to identify natural or deliberate engineering events including chromosome fragment
swapping, molecular breeding (gene shuffling) and emerging infectious diseases. The present
invention provides broad-function technology that may be the only practical means for rapid
diagnosis of disease caused by a biowarfare or bioterrorist attack, especially an attack that might
otherwise be missed or mistaken for a more common infection.

Bacterial biological warfare agents capable of being detected by the present methods 
include, but are not limited to, Bacillus anthracis (anthrax), Yersinia pestis (pneumonic plague), 
Francisella tularensis (tularemia), Brucella suis, Brucella abortus, Brucella melitensis
(undulant fever), Burkholderia mallei (glanders), Burkholderia pseudomallei (melioidosis),
Salmonella typhi (typhoid fever), Rickettsia typhi (endemic typhus), Rickettsia prowazekii
(epidemic typhus) and Coxiella burnetii (Q fever), Rhodobacter capsulatus, Chlamydia
pneumoniae, Escherichia coli, Shigella dysenteriae, Shigella flexneri, Bacillus cereus,
Clostridium botulinum, Pseudomonas aeruginosa, Legionella pneumophila,
and Vibrio cholerae.

Besides 16S and 23S rRNA, other target regions suitable for use in the present invention
for detection of bacteria include, but are not limited to, 5S rRNA and RNase P (Figure 3).

Fungal biowarfare agents include, but are not limited to, Coccidioides immitis 
(Coccidioidomycosis), and Magnaporthe grisea. 

Biological warfare toxin genes capable of being detected by the methods of the present
invention include, but are not limited to, botulinum toxin, T-2 mycotoxins, ricin, staphylococcal
enterotoxin B, Shiga toxin, abrin, aflatoxin, Clostridium perfringens epsilon toxin, conotoxins,
diacetoxyscirpenol, tetrodotoxin and saxitoxin.

Parasites that could be used in biological warfare include, but are not limited to: Ascaris
suum, Giardia lamblia, Cryptosporidium, and Schistosoma.

Biological warfare viral threat agents are mostly RNA viruses (positive-strand and 
negative-strand), with the exception of smallpox. Every RNA virus is a family of related viruses 
(quasispecies). These viruses mutate rapidly and the potential for engineered strains (natural or
deliberate) is very high. RNA viruses cluster into families that have conserved RNA structural
domains on the viral genome (e.g., virion components, accessory proteins) and conserved
housekeeping genes that encode core viral proteins including, for single-stranded positive-strand
RNA viruses, RNA-dependent RNA polymerase, double stranded RNA helicase, chymotrypsin-like
and papain-like proteases and methyltransferases. "Housekeeping genes" refers to genes that
are generally always expressed and thought to be involved in routine cellular metabolism.

Examples of (-)-strand RNA viruses include, but are not limited to, arenaviruses (e.g.,
Sabia virus, Lassa fever, Machupo, Argentine hemorrhagic fever, Flexal virus), bunyaviruses (e.g.,
hantavirus, nairovirus, phlebovirus, Hantaan virus, Crimean-Congo hemorrhagic fever, Rift Valley
fever), and mononegavirales (e.g., filovirus, paramyxovirus, Ebola virus, Marburg, equine
morbillivirus).

Examples of (+)-strand RNA viruses include, but are not limited to, picornaviruses (e.g.,
coxsackievirus, echovirus, human coxsackievirus A, human echovirus, human enterovirus,
human poliovirus, hepatitis A virus, human parechovirus, human rhinovirus), astroviruses (e.g.,
human astrovirus), caliciviruses (e.g., chiba virus, chitta virus, human calicivirus, Norwalk virus),
nidovirales (e.g., human coronavirus, human torovirus), flaviviruses (e.g., dengue virus 1-4,
Japanese encephalitis virus, Kyasanur forest disease virus, Murray Valley encephalitis virus,
Rocio virus, St. Louis encephalitis virus, West Nile virus, yellow fever virus, hepatitis C virus)
and togaviruses (e.g., Chikungunya virus, Eastern equine encephalitis virus, Mayaro virus,
O'nyong-nyong virus, Ross River virus, Venezuelan equine encephalitis virus, Rubella virus,
hepatitis E virus). The hepatitis C virus has a 5'-untranslated region of 340 nucleotides, an open
reading frame encoding 9 proteins having 3010 amino acids and a 3'-untranslated region of 240
nucleotides. The 5'-UTR and 3'-UTR are 99% conserved in hepatitis C viruses.

In one embodiment, the target gene is an RNA-dependent RNA polymerase or a 
helicase encoded by (+)-strand RNA viruses, or RNA polymerase from a (-)-strand RNA virus. 
(+)-strand RNA viruses are double stranded RNA and replicate by RNA-directed RNA synthesis
using RNA-dependent RNA polymerase and the positive strand as a template. Helicase unwinds
the RNA duplex to allow replication of the single stranded RNA. These viruses include viruses
from the family picornaviridae (e.g., poliovirus, coxsackievirus, echovirus), togaviridae (e.g.,
alphavirus, flavivirus, rubivirus), arenaviridae (e.g., lymphocytic choriomeningitis virus, lassa
fever virus), coronaviridae (e.g., human respiratory virus) and Hepatitis A virus. The genes
encoding these proteins comprise variable and highly conserved regions that flank the variable
regions.

In one embodiment, the method can be used to detect the presence of antibiotic
resistance and/or toxin genes in a bacterial species. For example, Bacillus anthracis comprising a
tetracycline resistance plasmid and plasmids encoding one or both anthracis toxins (pXO1 and/or
pXO2) can be detected by using antibiotic resistance primer sets and toxin gene primer sets. If the
B. anthracis is positive for tetracycline resistance, then a different antibiotic, for example a
quinolone, is used.

While the present invention has been described with specificity in accordance with
certain of its embodiments, the following examples serve only to illustrate the invention and are
not intended to limit the same.

EXAMPLES 
Example 1: Nucleic Acid Isolation and PCR

In one embodiment, nucleic acid is isolated from the organisms and amplified by PCR 
using standard methods prior to BCS determination by mass spectrometry. Nucleic acid is 
isolated, for example, by detergent lysis of bacterial cells, centrifugation and ethanol 
precipitation. Nucleic acid isolation methods are described in, for example, Current Protocols in
Molecular Biology (Ausubel et al.) and Molecular Cloning: A Laboratory Manual (Sambrook et
al.). The nucleic acid is then amplified using standard methodology, such as PCR, with primers
which bind to conserved regions of the nucleic acid which contain an intervening variable
sequence as described below.

General Genomic DNA Sample Prep Protocol: Raw samples are filtered using Supor-200
0.2 µm membrane syringe filters (VWR International). Samples are transferred to 1.5 ml
eppendorf tubes pre-filled with 0.45 g of 0.7 mm Zirconia beads followed by the addition of 350
µl of ATL buffer (Qiagen, Valencia, CA). The samples are subjected to bead beating for 10
minutes at a frequency of 19 1/s in a Retsch Vibration Mill (Retsch). After centrifugation,
samples are transferred to an S-block plate (Qiagen) and DNA isolation is completed with a
BioRobot 8000 nucleic acid isolation robot (Qiagen).

Swab Sample Protocol: Allegiance S/P brand culture swabs and collection/transport
system are used to collect samples. After drying, swabs are placed in 17x100 mm culture tubes
(VWR International) and the genomic nucleic acid isolation is carried out automatically with a
Qiagen Mdx robot and the Qiagen QIAamp DNA Blood BioRobot Mdx genomic preparation kit
(Qiagen, Valencia, CA).

Example 2: Mass spectrometry 

FTICR Instrumentation: The FTICR instrument is based on a 7 tesla actively shielded
superconducting magnet and modified Bruker Daltonics Apex II 70e ion optics and vacuum
chamber. The spectrometer is interfaced to a LEAP PAL autosampler and a custom fluidics
control system for high throughput screening applications. Samples are analyzed directly from
96-well or 384-well microtiter plates at a rate of about 1 sample/minute. The Bruker
data-acquisition platform is supplemented with a lab-built ancillary NT datastation which controls the
autosampler and contains an arbitrary waveform generator capable of generating complex
rf-excite waveforms (frequency sweeps, filtered noise, stored waveform inverse Fourier transform
(SWIFT), etc.) for sophisticated tandem MS experiments. For oligonucleotides in the 20-30-mer
regime typical performance characteristics include mass resolving power in excess of 100,000
(FWHM), low ppm mass measurement errors, and an operable m/z range between 50 and 5000
m/z.

Modified ESI Source: In sample-limited analyses, analyte solutions are delivered at 150
nL/minute to a 30 µm i.d. fused-silica ESI emitter mounted on a 3-D micromanipulator. The ESI
ion optics consist of a heated metal capillary, an rf-only hexapole, a skimmer cone, and an
auxiliary gate electrode. The 62 cm rf-only hexapole comprises 1 mm diameter rods and is
operated at a voltage of 380 Vpp at a frequency of 5 MHz. A lab-built electro-mechanical shutter
can be employed to prevent the electrospray plume from entering the inlet capillary unless
triggered to the "open" position via a TTL pulse from the data station. When in the "closed"
position, a stable electrospray plume is maintained between the ESI emitter and the face of the
shutter. The back face of the shutter arm contains an elastomeric seal that can be positioned to
form a vacuum seal with the inlet capillary. When the seal is removed, a 1 mm gap between the
shutter blade and the capillary inlet allows constant pressure in the external ion reservoir
regardless of whether the shutter is in the open or closed position. When the shutter is triggered,
a "time slice" of ions is allowed to enter the inlet capillary and is subsequently accumulated in
the external ion reservoir. The rapid response time of the ion shutter (< 25 ms) provides
reproducible, user defined intervals during which ions can be injected into and accumulated in
the external ion reservoir.

Apparatus for Infrared Multiphoton Dissociation: A 25 watt CW CO2 laser operating at
10.6 µm has been interfaced to the spectrometer to enable infrared multiphoton dissociation
(IRMPD) for oligonucleotide sequencing and other tandem MS applications. An aluminum
optical bench is positioned approximately 1.5 m from the actively shielded superconducting
magnet such that the laser beam is aligned with the central axis of the magnet. Using standard
IR-compatible mirrors and kinematic mirror mounts, the unfocused 3 mm laser beam is aligned
to traverse directly through the 3.5 mm holes in the trapping electrodes of the FTICR trapped ion
cell and longitudinally traverse the hexapole region of the external ion guide, finally impinging
on the skimmer cone. This scheme allows IRMPD to be conducted in an m/z selective manner in
the trapped ion cell (e.g. following a SWIFT isolation of the species of interest), or in a
broadband mode in the high pressure region of the external ion reservoir where collisions with
neutral molecules stabilize IRMPD-generated metastable fragment ions resulting in increased
fragment ion yield and sequence coverage.

Example 3: Identification of Bioagents 

Table 2 shows a small cross section of a database of calculated molecular masses for
over 9 primer sets and approximately 30 organisms. The primer sets were derived from rRNA
alignment. Examples of regions from rRNA consensus alignments are shown in Figures 1A-1C.
Lines with arrows are examples of regions to which intelligent primer pairs for PCR are
designed. The primer pairs are >95% conserved in the bacterial sequence database (currently
over 10,000 organisms). The intervening regions are variable in length and/or composition, thus
providing the base composition "signature" (BCS) for each organism. Primer pairs were chosen
so the total length of the amplified region is less than about 80-90 nucleotides. The label for each
primer pair represents the starting and ending base number of the amplified region on the
consensus diagram.

Included in the short bacterial database cross-section in Table 2 are many well known
pathogens/biowarfare agents (shown in bold/red typeface) such as Bacillus anthracis or Yersinia
pestis as well as some of the bacterial organisms found commonly in the natural environment
such as Streptomyces. Even closely related organisms can be distinguished from each other by
the appropriate choice of primers. For instance, two low G+C organisms, Bacillus anthracis and
Staphylococcus aureus, can be distinguished from each other by using the primer pair defined by
16S_1337 or 23S_855 (ΔM of 4 Da).



Table 2: Cross Section Of A Database Of Calculated Molecular Masses¹

Primer Regions →   16S_971   16S_1100   16S_1337   16S_1294   16S_1228   23S_1021   23S_855   23S_193   23S_115
Bug Name


Acinetobacter calcoaceticus

Bordetella bronchiseptica
Borrelia burgdorferi
Brucella abortus
Campylobacter jejuni
Chlamydia pneumoniae
Clostridium botulinum










51295,4 


30299 






54999 
.56850,3., 


55923,1 


54387.9 


25447.5 




51295.4 


30295 


42651 


39560.5 


















3 19?0,5 


.56231,2 


.55621.1 


28440.7 


35852,9 


51295.4 




42029,9 


38941.4 


52524.6 


68098 


SS011 


















64386.9 


J2S061.8 


358569 


.5K74.? 


30594 






45732,5 


55000 


.53767.,.. 




3S86S 








38941 


54999 


Enterococcus faecalis
Escherichia coli
Francisella tularensis


.56855,3 


543869 
54387.9 


-28444,7. 
28447 6 


35863.9 
35858,9 


.51296.4, 

51298.4 


-.80297 


41417.8 
42652 




56612.2 
56849,3 


53769 


84383 






, 51301 

51?98 


?Q3ff1 „ 








Haemophilus influenzae 
Klebsiella pneumoniae 
Legionella pneumophila 
Mycobacterium avium 
Mycobacterium leprae 


55820.1, 




28444,7 
28442.7 




51298.4 
512S7.4 


30298 


42656 
42655 


39560.5 
39562.5 


55613.1 


55618 


55626 


28446 


35857 


51303 










Si900,e, 

54380,9 


5«6?1,1 


28064,8 
..26064,8. 


..3S8.S8 S 9_ 
35860,9 


51B1S.S 
S1917.S 


30598 , 


42656., , 


38942,4 
39559,5 


56241.2 
56240-2 


Mycoplasma genitalium 
Mycoplasma pneumoniae 
Neisseria gonorrhoeae 


531437 


4S11S.4 


L29081,8 


35860.9 


51301.4^ 
50871 .3 


30299 
30294 


42658 
43264.1 


39558.fi 


56243,2 . 
56842.4, 




.45118,4 
54389,9. 


29061 ,a„ 
_ZS44J 




51302,4 


30294 
30298 


42849 
43.272 


39559,6 
39561. S 
39559 


56843,4 
55619 


Rickettsia prowazekii
Rickettsia rickettsii
Salmonella typhimurium
Shigella dysenteriae
Staphylococcus aureus
Streptomyces
Treponema pallidum
Vibrio cholerae
Vibrio parahaemolyticus
Yersinia pestis


5W9? 
58PM 


. 55621 


28448 


JM5J 


50677 
SQ679 


-3P293 


42650 
42548 


39559 


53139 
5J755 


..$5622 


S5005 






51301 


















51301 














28443.7., 












-514.664 


54389,9 
.56245.? 


59341.6 
55631,1 


J$44,5,7_., 


35858.9 
35851.0 


51300,4 
61297.4 


30290 


42034.9... 


.. 39563.5 
38939.4 


56864 3 
57473,4 




■ 55626 . 






52536 






36241 




54384,9 


55626,1 


28444.7 


34620 7 


-5QP64.2,.. ■ 


















51299 











¹Molecular mass distribution of PCR amplified regions for a selection of organisms (rows)
across various primer pairs (columns). Pathogens are shown in bold. Empty cells indicate
presently incomplete or missing data.

Figure 6 shows the use of ESI-FT-ICR MS for measurement of exact mass. The spectra
from 46mer PCR products originating at position 1337 of the 16S rRNA from S. aureus (upper)
and B. anthracis (lower) are shown. These data are from the region of the spectrum containing
signals from the [M-8H+]8- charge states of the respective 5'-3' strands. The two strands differ
by two (AT→CG) substitutions, and have measured masses of 14206.396 and 14208.373 ±
0.010 Da, respectively. The possible base compositions derived from the masses of the forward
and reverse strands for the B. anthracis products are listed in Table 3.

Table 3: Possible base composition for B. anthracis products

Calc. Mass    Error       Base Comp.
14208.2935    0.079520    A1 G17 C10 T18
14208.3160    0.056980    A1 G20 C15 T10
14208.3386    0.034440    A1 G23 C20 T2
14208.3074    0.065560    A6 G11 C3 T26
14208.3300    0.043020    A6 G14 C8 T18
14208.3525    0.020480    A6 G17 C13 T10
14208.3751    0.002060    A6 G20 C18 T2
14208.3439    0.029060    A11 G8 C1 T26
14208.3665    0.006520    A11 G11 C6 T18
14208.3890    0.016020    A11 G14 C11 T10
14208.4116    0.038560    A11 G17 C16 T2
14208.4030    0.029980    A16 G8 C4 T18
14208.4255    0.052520    A16 G11 C9 T10
14208.4481    0.075060    A16 G14 C14 T2
14208.4395    0.066480    A21 G5 C2 T18
14208.4620    0.089020    A21 G8 C7 T10
14079.2624    0.080600    A0 G14 C13 T19
14079.2849    0.058060    A0 G17 C18 T11
14079.3075    0.035520    A0 G20 C23 T3
14079.2538    0.089180    A5 G5 C1 T35
14079.2764    0.066640    A5 G8 C6 T27
14079.2989    0.044100    A5 G11 C11 T19
14079.3214    0.021560    A5 G14 C16 T11
14079.3440    0.000980    A5 G17 C21 T3
14079.3129    0.030140    A10 G5 C4 T27
14079.3354    0.007600    A10 G8 C9 T19
14079.3579    0.014940    A10 G11 C14 T11
14079.3805    0.037480    A10 G14 C19 T3
14079.3494    0.006360    A15 G2 C2 T27
14079.3719    0.028900    A15 G5 C7 T19
14079.3944    0.051440    A15 G8 C12 T11
14079.4170    0.073980    A15 G11 C17 T3
14079.4084    0.065400    A20 G2 C5 T19
14079.4309    0.087940    A20 G5 C10 T13

Among the 16 compositions for the forward strand and the 18 compositions for the reverse
strand that were calculated, only one pair (A11 G14 C11 T10 and A10 G11 C14 T11, shown in bold)
is complementary, corresponding to the actual base compositions of the B. anthracis PCR products.
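
The complementarity filter described above can be illustrated with a short sketch. The candidate lists are transcribed from Table 3; the function names (parse, is_complementary) are illustrative only and are not part of the described method. The check simply requires that the A, G, C and T counts of one strand equal the T, C, G and A counts of the other.

```python
import re

# Candidate base compositions transcribed from Table 3.
FORWARD = [
    "A1G17C10T18", "A1G20C15T10", "A1G23C20T2", "A6G11C3T26",
    "A6G14C8T18", "A6G17C13T10", "A6G20C18T2", "A11G8C1T26",
    "A11G11C6T18", "A11G14C11T10", "A11G17C16T2", "A16G8C4T18",
    "A16G11C9T10", "A16G14C14T2", "A21G5C2T18", "A21G8C7T10",
]
REVERSE = [
    "A0G14C13T19", "A0G17C18T11", "A0G20C23T3", "A5G5C1T35",
    "A5G8C6T27", "A5G11C11T19", "A5G14C16T11", "A5G17C21T3",
    "A10G5C4T27", "A10G8C9T19", "A10G11C14T11", "A10G14C19T3",
    "A15G2C2T27", "A15G5C7T19", "A15G8C12T11", "A15G11C17T3",
    "A20G2C5T19", "A20G5C10T13",
]


def parse(comp):
    """Turn 'A11G14C11T10' into {'A': 11, 'G': 14, 'C': 11, 'T': 10}."""
    return {base: int(count) for base, count in re.findall(r"([AGCT])(\d+)", comp)}


def is_complementary(fwd, rev):
    """Watson-Crick pairing: each A pairs with a T and each G with a C."""
    return (fwd["A"] == rev["T"] and fwd["T"] == rev["A"]
            and fwd["G"] == rev["C"] and fwd["C"] == rev["G"])


# Test every forward candidate against every reverse candidate.
pairs = [(f, r) for f in FORWARD for r in REVERSE
         if is_complementary(parse(f), parse(r))]
print(pairs)  # [('A11G14C11T10', 'A10G11C14T11')]
```

Because both strands of a PCR product are measured, this constraint removes the ambiguity that remains when either strand mass is considered alone.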

Example 4: BCS of Region from Bacillus anthracis and Bacillus cereus

A conserved Bacillus region from B. anthracis (A14G9C14T9) and B. cereus
(A15G9C13T9) having a C to A base change was synthesized and subjected to ESI-TOF MS. The
results are shown in Figure 7 in which the two regions are clearly distinguished using the method
of the present invention (MW=14072.26 vs. 14096.29).
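
The relationship between base composition and molecular weight in this example can be approximated as follows. This is a sketch under stated assumptions: the residue masses and the end correction are commonly tabulated average values for unphosphorylated DNA oligonucleotides and are not taken from this specification, so the absolute masses agree with the reported values only to within roughly 0.1 Da. The mass difference caused by the single C to A change does not depend on the end correction.

```python
# Assumed average masses (Da) of deoxynucleotide residues within a DNA chain.
# These constants are textbook approximations, not values from this document.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}

# Assumed per-strand correction for a 5'-OH / 3'-OH oligonucleotide.
END_CORRECTION = -61.96


def average_mass(composition):
    """Approximate average molecular weight (Da) for a base composition."""
    total = sum(RESIDUE_MASS[base] * count for base, count in composition.items())
    return total + END_CORRECTION


b_anthracis = {"A": 14, "G": 9, "C": 14, "T": 9}
b_cereus = {"A": 15, "G": 9, "C": 13, "T": 9}

# One C replaced by one A: the difference is the A residue mass minus the C residue mass.
delta = average_mass(b_cereus) - average_mass(b_anthracis)
print(round(average_mass(b_anthracis), 2), round(average_mass(b_cereus), 2), round(delta, 2))
```

Under these assumptions the computed difference of about 24.03 Da matches the difference between the two reported molecular weights (14096.29 - 14072.26).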


Example 5: Identification of additional bioagents 

In other examples of the present invention, the pathogen Vibrio cholerae can be
distinguished from Vibrio parahaemolyticus with ΔM > 600 Da using one of three 16S primer sets
shown in Table 2 (16S_971, 16S_1228 or 16S_1294) as shown in Table 4. The two mycoplasma
species in the list (M. genitalium and M. pneumoniae) can also be distinguished from each other,
as can the three mycobacteria. While the direct mass measurements of amplified products can
identify and distinguish a large number of organisms, measurement of the base composition
signature provides dramatically enhanced resolving power for closely related organisms. In cases
such as Bacillus anthracis and Bacillus cereus that are virtually indistinguishable from each
other based solely on mass differences, compositional analysis or fragmentation patterns are used
to resolve the differences. The single base difference between the two organisms yields different
fragmentation patterns, and despite the presence of the ambiguous/unidentified base N at
position 20 in B. anthracis, the two organisms can be identified.

Tables 4a-b show examples of primer pairs from Table 1 which distinguish pathogens
from background.



Table 4a

Organism name            23S_855     16S_1337    23S_1021
Bacillus anthracis       42650.98    28447.65    30294.98
Staphylococcus aureus    42654.97    28443.67    30297.96

Table 4b

Organism name              16S_971     16S_1294    16S_1228
Vibrio cholerae            55625.09    35856.87    52535.59
Vibrio parahaemolyticus    54384.91    34620.67    50064.19



Table 5 shows the expected molecular weight and base composition of region
16S_1100-1188 in Mycobacterium avium and Streptomyces sp.

Table 5

Region           Organism name          Length    Molecular weight    Base comp.
16S_1100-1188    Mycobacterium avium    82        25624.1728          A16G32C18T16
16S_1100-1188    Streptomyces sp.       96        29904.871           A17G38C27T14



Table 6 shows base composition (single strand) results for 16S_1100-1188 primer
amplification reactions for different species of bacteria. Species which are repeated in the table
(e.g., Clostridium botulinum) are different strains which have different base compositions in the
16S_1100-1188 region.



Table 6

Organism name               Base comp.      Organism name                Base comp.
Mycobacterium avium         A16G32C18T16    Vibrio cholerae              A23G30C21T16
Streptomyces sp.            A17G38C27T14    Aeromonas hydrophila         A23G31C21T15
Ureaplasma urealyticum      A18G30C17T17    Aeromonas salmonicida        A23G31C21T15
Streptomyces sp.            A19G36C24T18    Mycoplasma genitalium        A24G19C12T18
Mycobacterium leprae        A20G32C22T16    Clostridium botulinum        A24G25C18T20
M. tuberculosis             AioGsjChTk      Bordetella bronchiseptica    A24G26C19T14
Nocardia asteroides         A20G33C21T16    Francisella tularensis       A24G26C19T19
Fusobacterium necroforum    A21G26C22T18    Bacillus anthracis           A24G26C20T18
Listeria monocytogenes      A21G27C19T19    Campylobacter jejuni         A24G26C20T18
Clostridium botulinum       A21G27C19T21    Staphylococcus aureus        A24G26C20T18
Neisseria gonorrhoeae       A21G28C21T18    Helicobacter pylori          A24G26C20T19
Bartonella quintana         A21G30C22T16    Helicobacter pylori          A24G26C21T18
Enterococcus faecalis       A22G27C20T19    Moraxella catarrhalis        A24G26C23T16
Bacillus megaterium         A22G28C20T18    Haemophilus influenzae Rd    A24G28C20T17
Bacillus subtilis           A22G28C21T17    Chlamydia trachomatis        A24G28C21T16
Pseudomonas aeruginosa      A22G29C23T15    Chlamydophila pneumoniae     A24G28C21T16
Legionella pneumophila      A22G32C20T16    C. pneumonia AR39            A24G28C21T16
Mycoplasma pneumoniae       A23G20C14T16    Pseudomonas putida           A24G29C21T16
Clostridium botulinum       A23G26C20T19    Proteus vulgaris             A24G30C21T15
Enterococcus faecium        A23G2SC21T18    Yersinia pestis              A24G30C21T15
Acinetobacter calcoaceti    A23G26C21T19    Yersinia pseudotuberculos    A24G30C21T15
Leptospira borgpeterseni    A23G26C24T15    Clostridium botulinum        A25G24C18T21
Leptospira interrogans      A23G26C24T15    Clostridium tetani           A25G25C18T20
Clostridium perfringens     A23G27C19T19    Francisella tularensis       A25G25C19T19
Bacillus anthracis          A23G27C20T18    Acinetobacter calcoacetic    A25G26C20T19
Bacillus cereus             A23G27C20T18    Bacteriodes fragilis         A25G27C16T22
                            A23G27C20T18    Chlamydophila psittaci       A25G27C21T16
Aeromonas hydrophila        A23G29C21T16    Borrelia burgdorferi         A25G29C17T19
Escherichia coli            A23G29C21T16    Streptobacillus monilifor    A26G26C20T16
Pseudomonas putida          A23G29C21T17    Rickettsia prowazekii        A26G28C18T18
Escherichia coli            A23G29C22T15    Rickettsia rickettsii        A26G28C20T16
Shigella dysenteriae        A23G29C22T15    Mycoplasma mycoides          A28G23C16T20



Entries for the same organism having different base compositions are different strains. Groups of
organisms which are highlighted or in italics have the same base compositions in the amplified
region. Some of these organisms can be distinguished using multiple primers. For example,
Bacillus anthracis can be distinguished from Bacillus cereus and Bacillus thuringiensis using the
primer 16S_971-1062 (Table 7). Other primer pairs which produce unique base composition
signatures are shown in Table 6 (bold). Clusters containing very similar threat and ubiquitous
non-threat organisms (e.g. anthracis cluster) are distinguished at high resolution with focused
sets of primer pairs. The known biowarfare agents in Table 6 are Bacillus anthracis, Yersinia
pestis, Francisella tularensis and Rickettsia prowazekii.

Table 7

Organism                       16S_971-1062    16S_1228-1310    16S_1100-1188
Aeromonas hydrophila           A21G29C22T20    A22G27C21T13     A23G31C21T15
Aeromonas salmonicida          A21G29C22T20    A22G27C21T13     A23G31C21T15
Bacillus anthracis             A21G27C22T22    A24G22C19T18     A23G27C20T18
Bacillus cereus                A22G27C21T22    A24G22C19T18     A23G27C20T18
Bacillus thuringiensis         A22G27C21T22    A24G22C19T18     A23G27C20T18
Chlamydia trachomatis          A22G26C20T23    A24G23C19T16     A24G28C21T16
Chlamydia pneumoniae AR39      A26G23C20T22    A26G22C16T18     A24G28C21T16
Leptospira borgpetersenii      A22G26C20T21    A22G25C21T15     A23G26C24T15
Leptospira interrogans         A22G26C20T21    A22G25C21T15     A23G26C24T15
Mycoplasma genitalium          A28G23C15T22    A30G18C15T19     A24G19C12T18
Mycoplasma pneumoniae          A28G23C15T22    A27G19C16T20     A23G20C14T16
Escherichia coli               A22G28C20T22    A24G25C21T13     A23G29C22T15
Shigella dysenteriae           A22G28C21T21    A24G25C21T13     A23G29C22T15
Proteus vulgaris               A23G26C22T21    AWG24C19T14      A24G30C21T15
Yersinia pestis                A24G25C21T22    A25G24C20T14     A24G30C21T15
Yersinia pseudotuberculosis    A24G25C21T22    A25G24C20T14     A24G30C21T15
Francisella tularensis         A20G2SC21T23    A23G26C17T17     A24G26C19T19
Rickettsia prowazekii          A21G26C24T25    A24G23C16T19     A26G28C18T18
Rickettsia rickettsii          A21G26C25T24    A24G24C17T17     A26G28C20T16
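
The use of multiple primer pairs to separate organisms that share a signature in one region can be sketched as follows. The data are a subset of Table 7; the B. anthracis 16S_1228-1310 entry is read here as A24G22C19T18, which is an interpretation of a table cell that is only partly legible. The names SIGNATURES and distinguishing_primers are illustrative and not part of the described method.

```python
PRIMERS = ("16S_971-1062", "16S_1228-1310", "16S_1100-1188")

# Base composition signatures per organism, in the column order of PRIMERS.
SIGNATURES = {
    "Bacillus anthracis":     ("A21G27C22T22", "A24G22C19T18", "A23G27C20T18"),
    "Bacillus cereus":        ("A22G27C21T22", "A24G22C19T18", "A23G27C20T18"),
    "Bacillus thuringiensis": ("A22G27C21T22", "A24G22C19T18", "A23G27C20T18"),
    "Mycoplasma genitalium":  ("A28G23C15T22", "A30G18C15T19", "A24G19C12T18"),
    "Mycoplasma pneumoniae":  ("A28G23C15T22", "A27G19C16T20", "A23G20C14T16"),
}


def distinguishing_primers(organism_a, organism_b):
    """Return the primer regions whose signatures differ between two organisms."""
    return [
        primer
        for primer, sig_a, sig_b in zip(PRIMERS, SIGNATURES[organism_a], SIGNATURES[organism_b])
        if sig_a != sig_b
    ]


print(distinguishing_primers("Bacillus anthracis", "Bacillus cereus"))
print(distinguishing_primers("Bacillus cereus", "Bacillus thuringiensis"))
print(distinguishing_primers("Mycoplasma genitalium", "Mycoplasma pneumoniae"))
```

With these three regions, B. anthracis separates from B. cereus only at 16S_971-1062, while B. cereus and B. thuringiensis remain identical, which is consistent with the need for focused primer sets within such clusters.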



The sequence of B. anthracis and B. cereus in region 16S_971 is shown below. Shown
in bold is the single base difference between the two species that can be detected using the
methods of the present invention. B. anthracis has an ambiguous base at position 20.

B.anthracis_16S_971
GCGAAGAACCUUACCAGGUNUUGACAUCCUCUGACAACCCUAGAGAUAGGGCUUC
UCCUUCGGGAGCAGAGUGACAGGUGGUGCAUGGUU (SEQ ID NO:1)

B.cereus_16S_971
GCGAAGAACCUUACCAGGUCUUGACAUCCUCUGAAAACCCUAGAGAUAGGGCUUC
UCCUUCGGGAGCAGAGUGACAGGUGGUGCAUGGUU (SEQ ID NO:2)

Example 6: ESI-TOF MS of sspE 56-mer Plus Calibrant

The mass measurement accuracy that can be obtained using an internal mass standard in
the ESI-MS study of PCR products is shown in Fig. 8. The mass standard was a 20-mer
phosphorothioate oligonucleotide added to a solution containing a 56-mer PCR product from the
B. anthracis spore coat protein sspE. The mass of the expected PCR product distinguishes B.
anthracis from other species of Bacillus such as B. thuringiensis and B. cereus.

Example 7: B. anthracis ESI-TOF Synthetic 16S_1228 Duplex

An ESI-TOF MS spectrum was obtained from an aqueous solution containing 5 µM
each of synthetic analogs of the expected forward and reverse PCR products from the nucleotide
1228 region of the B. anthracis 16S rRNA gene. The results (Fig. 9) show that the molecular
weights of the forward and reverse strands can be accurately determined and easily distinguish
the two strands. The [M-21H+]21- and [M-20H+]20- charge states are shown.

Example 8: ESI-FTICR-MS of Synthetic B. anthracis 16S_1337 46 Base Pair Duplex

An ESI-FTICR-MS spectrum was obtained from an aqueous solution containing 5 µM
each of synthetic analogs of the expected forward and reverse PCR products from the nucleotide
1337 region of the B. anthracis 16S rRNA gene. The results (Fig. 10) show that the molecular
weights of the strands can be distinguished by this method. The [M-16H+]16- through
[M-10H+]10- charge states are shown. The insert highlights the resolution that can be realized on the
FTICR-MS instrument, which allows the charge state of the ion to be determined from the mass
difference between peaks differing by a single 13C substitution.
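
The charge-state determination mentioned above follows from simple arithmetic: adjacent isotope peaks differ by one 13C-for-12C substitution (about 1.003355 Da), so on the m/z axis they are spaced by that amount divided by the charge. A minimal sketch, assuming the reported masses are neutral masses and using standard physical constants that are not stated in this specification:

```python
PROTON_MASS = 1.007276    # Da, assumed standard value
C13_C12_DIFF = 1.003355   # Da, mass difference between 13C and 12C


def mz_negative(neutral_mass, z):
    """m/z of the [M - zH]z- ion formed by removing z protons."""
    return (neutral_mass - z * PROTON_MASS) / z


def charge_from_spacing(delta_mz):
    """Estimate the charge state from the m/z spacing of adjacent isotope peaks."""
    return round(C13_C12_DIFF / delta_mz)


# The 14208.373 Da strand of Figure 6 observed as the [M-8H+]8- ion.
print(round(mz_negative(14208.373, 8), 4))

# An isotope spacing of about 0.1254 m/z corresponds to a charge of 8.
print(charge_from_spacing(0.1254))
```

Resolving a spacing of roughly 0.06 m/z, as needed for the 16- charge state, is what requires the high resolving power of the FTICR instrument.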

Example 9: ESI-TOF MS of 56-mer Oligonucleotide from saspB Gene of B. anthracis with
Internal Mass Standard

ESI-TOF MS spectra were obtained on a synthetic 56-mer oligonucleotide (5 µM) from
the saspB gene of B. anthracis containing an internal mass standard at an ESI flow rate of
1.7 µL/min as a function of sample consumption. The results (Fig. 11) show that the signal to
noise is improved as more scans are summed, and that the standard and the product are visible
after only 100 scans.

Example 10: ESI-TOF MS of an Internal Standard with Tributylammonium (TBA)-
trifluoroacetate (TFA) Buffer

An ESI-TOF-MS spectrum of a 20-mer phosphorothioate mass standard was obtained
following addition of 5 mM TBA-TFA buffer to the solution. This buffer strips charge from the
oligonucleotide and shifts the most abundant charge state from [M-8H+]8- to [M-3H+]3- (Fig. 12).

Example 11: Master Database Comparison

The molecular masses obtained through Examples 1-10 are compared to molecular 
masses of known bioagents stored in a master database to obtain a high probability matching 
molecular mass. 
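
A database comparison of this kind might be sketched as a tolerance search. The entries below are taken from Tables 4a and 4b; the dictionary layout, the function name match and the 20 ppm tolerance are assumptions made for illustration and do not describe the actual master database.

```python
# Hypothetical miniature database: (organism, primer region) -> calculated mass in Da.
DATABASE = {
    ("Bacillus anthracis", "16S_1337"): 28447.65,
    ("Staphylococcus aureus", "16S_1337"): 28443.67,
    ("Vibrio cholerae", "16S_971"): 55625.09,
    ("Vibrio parahaemolyticus", "16S_971"): 54384.91,
}


def match(measured_mass, primer, tolerance_ppm=20.0):
    """Return organisms whose stored mass lies within tolerance, best match first."""
    hits = []
    for (organism, region), stored_mass in DATABASE.items():
        if region != primer:
            continue
        error_ppm = abs(measured_mass - stored_mass) / stored_mass * 1e6
        if error_ppm <= tolerance_ppm:
            hits.append((error_ppm, organism))
    return [organism for _, organism in sorted(hits)]


print(match(28447.70, "16S_1337"))
print(match(30000.00, "16S_1337"))
```

The 4 Da difference between the two 16S_1337 entries corresponds to roughly 140 ppm, so a tolerance of a few tens of ppm is enough to separate them in this sketch.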



Example 12: Master Database Interrogation over the Internet

The same procedure as in Example 11 is followed except that the local computer does not
store the Master database. The Master database is interrogated over an internet connection,
searching for a molecular mass match.

Example 13: Master Database Updating

The same procedure as in Example 11 is followed except the local computer is
connected to the internet and has the ability to store a master database locally. The local
computer system periodically, or at the user's discretion, interrogates the Master database,
synchronizing the local master database with the global Master database. This provides the
current molecular mass information to both the local database as well as to the global Master
database. This further provides more of a globalized knowledge base.
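
One way to picture the two-way synchronization described above is a merge in which the most recent record for each entry is copied to both databases. This is only a sketch: the record layout (mass, version) and the function name synchronize are hypothetical, and the specification does not state how conflicts between local and global entries are resolved.

```python
def synchronize(local_db, master_db):
    """Two-way merge of {(organism, primer): (mass, version)}; the newest version wins."""
    for key in set(local_db) | set(master_db):
        candidates = [db[key] for db in (local_db, master_db) if key in db]
        newest = max(candidates, key=lambda record: record[1])
        local_db[key] = newest
        master_db[key] = newest


# Hypothetical records: the local copy holds an outdated entry and one new entry.
local = {
    ("Bacillus anthracis", "16S_1337"): (28447.60, 1),
    ("Vibrio cholerae", "16S_971"): (55625.09, 1),
}
master = {
    ("Bacillus anthracis", "16S_1337"): (28447.65, 2),
    ("Staphylococcus aureus", "16S_1337"): (28443.67, 1),
}

synchronize(local, master)
print(local == master, len(local))
```

After the call both databases hold the same three entries, with the newer B. anthracis record replacing the outdated local one.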



Example 14: Global Database Updating 

The same procedure as in Example 13 is followed except there are numerous such local
stations throughout the world. The synchronization of each database adds to the diversity of
information and diversity of the molecular masses of known bioagents.



Example 15: Demonstration of Detection and Identification of Five Species of Bacteria in a 
Mixture 

Broad range intelligent primers were chosen following analysis of a large collection of
curated bacterial 16S rRNA sequences representing greater than 4000 species of bacteria.
Examples of primers capable of priming from greater than 90% of the organisms in the
collection include, but are not limited to, those exhibited in Table 8 wherein Tp = 5'propynylated
uridine and Cp = 5'propynylated cytidine.

Table 8

Intelligent Primer Pairs for Identification of Bacteria

Primer Pair Name     Forward Primer Sequence          Forward SEQ ID NO:   Reverse Primer Sequence           Reverse SEQ ID NO:
16S_EC_1077_1195     3TGAGATGTTGGGTTAAGTCCCGTAACGAG
16S_EC_1082_1197     ^TGTTGGGTTAAGTCCCGCAACGAG                             TTGACGTCATCCCCACCTTCCTC
16S_EC_1090_1196     TTAAGTCCCGCAACGATCGCAA                                TGACGTCATCCCCACCTTCCTC
16S_EC_1222_1323     GCTACACACGTGCTACAATG                                  2GAGTTGCAGACTGCGATCCG
16S_EC_1332_1407     AAGTCGGAATCGCTAGTAATCG                                3ACGGGCGGTGTGTACAAG
16S_EC_30_126        TGAACGCTGGTGGCATGCTTAACAC                             TACGCATTACTCACCCGTCCGC
16S_EC_38_120        GTGGCATGCCTAATACATGCAAGTCG       20                   TTACTCACCCGTCCGCCGCT              21
16S_EC_49_120        TAACACATGCAAGTCGAACG                                  TTACTCACCCGTCCGCC
16S_EC_683_795       GTGTAGCGGTGAAATGCG                                    GTATCTAATCCTGTTTGCTCCC
16S_EC_713_809       AGAACACCGATGGCGAAGGC                                  CGTGGACTACCAGGGTATCTA
16S_EC_785_897       GGATTAGAGACCCTGGTAGTCC                                GGCCGTACTCCCCAGGCG
16S_EC_785_897_2     GGATTAGATACCCTGGTAGTCCACGC                            GGCCGTACTCCCCAGGCG
16S_EC_789_894       TAGATACCCTGGTAGTCCACGC           32                   CGTACTCCCCAGGCG                   33
16S_EC_960_1073      TTCGATGCAACGCGAAGAACCT
16S_EC_969_1078      ACGCGAAGAACCTTACC                36                   ACGACACGAGCTGACGAC                37
23S_EC_1826_1924     CTGACACCTGCCCGGTGC               38                   GACCGTTATAGTTACGGCC               39
23S_EC_2645_2761     TCTGTCCCTAGTACGAGAGGACCGG        40                   TGCTTAGATGCTTTCAGC                41
23S_EC_2645_2767     CTGTCCCTAGTACGAGAGGACC           42                   GTTTCATGCTTAGATGCTTTCA            43
23S_EC_493_571       GGGGAGTGAAAGAGATCCTGAAACCG       44                   ACAAAAGGTACGCCGTCACCC             45
23S_EC_493_571_2     GGGGAGTGAAAGAGATCCTGAAACCG       46                   ACAAAAGGCACGCCATCACCC             47
23S_EC_971_1077      CGAGAGGGAAACAACCCAGACC           48                   TGGCTGCTTCTAAGCCAAC               49
INFB_EC_1365_1467    TGCTCGTGGTGCACAAGTAACGGATATTA    50                   TGCTGCTTTCGCATGGTTAATTGCTTCAA     51
RPOC_EC_1018_1124    CAAAACTTATTAGGTAAGCGTGTTGACT     52                   TCAAGCGCCATTTCTTTTGGTAAACCACAT    53


RPOC EC 10 
18_1124_2 


CAAAACTTATTAGGTAAGCGTG 
TTGACT 


54 


TCAAGCGCCATCTCTTTCGGTA 
ATCCACAT 


55 


RPOC EC 11 
4_232 


TAAGAAGCCGGAAACCATCAAC 
TACCG 


56 


GGCGCTTGTACTTACCGCAC 


57 



WO 2004/060278 



47 



PCT/US2003/038761 



RPOC EC 21 
78 2246 


fGATTCTGGTGCCCGTGGT 




'TGGCCATCAGGCCACGCATAC 


59 


RPOC EC 21 
78_2246_2 


rGATTCCGGTGCCCGTGGT 


60 


rTGGCCATCAGACCACGCATAC 


61 


RPOC EC 22 
18 2337 


CTGGCAGGTATGCGTGGTCTGA 
TG 


62 


CGCACCGTGGGTTGAGATGAAG 

rAc 


63 


RPOC EC 22 
18_2337_2 


CTTGCTGGTATGCGTGGTCTGA 
TG 


64 


CGCACCATGCGTAGAGATGAAG 
TAC 


65 


RPOC EC 80 
8^889 


CGTCGGGTGATTAACCGTAACA 
ACCG 


66 


GTTTTTCGTTGCGTACGATGAT 
GTC 


67 


RPOC EC 80 
8_891 


CGTCGTGTAATTAACCGTAACA 
ACCG 


68 


ATGCT 


69 


RPOC EC 99 
3JL059 


CAAAGGTAAGCAAGGTCGTTTC 
CGTCA 




AACGGCCT GAGTAGTC AACA 

CG 




RPOC EC 99 
3_1059_2 


CAAAGGTAAGCAAGGACGTTTC 
CGTCA 


72 


CGAACGGCCAGAGTAGTCAACA 
CG 


73 


TOFB EC 23 
9 303 


TAGACTGCCCAGGACACGCTG 


74 


GCCGTCCATCTGAGCAGCACC 


75 


TOFB EC 23 
9 303 2 


TTGACTGCCCAGGTCACGCTG 




3CCGTCCATTTGAGCAGCACC 




TOFB EC 97 
6 1068 


AACTACCGTCCGCAGTTCTACT 
TCC 


78 


GTTGTCGCCAGGCATAACCATT 
TC 


79 


TUFB EC 97 
6_1068_2 


AACTACCGTCCTCAGTTCTACT 
TCC 


80 


GTTGTCACCAGGCATTACCATT 
TC 


81 


TUFB EC 98 
5_1062 


CCACAGTTCTACTTCCGTACTA 
CTGACG 


82 


TCCAGGCATTACCATTTCTACT 
CCTTCTGG 


83 


RPLB EC 65 
0~762~ 


GACCTACAGTAAGAGGTTCTGT 
AATGAACC 


84 


TCCAAGTGCTGGTTTACCCCAT 
GG 


85 


RPLB EC 68 
8 _ 757~ 


CATCCACACGGTGGTGGTGAAG 
G 


86 


GTGCTGGTTTACCCCATGGAGT 


87 


RPOC EC 10 
36J1126 


CGTGTTGACTATTCGGGGCGTT 
CAG 


38 


ATTCAAGAGCCATTTCTTTTGG 
TAAACCAC 


89 


RPOB EC 37 
623865 


TCAACAACCTCTTGGAGGTAAA 
GCTCAGT . 




TTTCTTGAAGAGTATGAGCTGC 
TCCGTAAG 




RPLB EC 68 
8_771 


CATCCACACGGTGGTGGTGAAG 
G 




TGTTTTGTATCCAAGTGCTGGT 
TTACCCC 




VALS EC 11 
05 1218 


CGTGGCGGCGTGGTTATCGA 




CGGTACGAACTGGATGTCGCCG 
TT 




RPOB EC 18 
45 1929 


TATCGCTCAGGCGAACTCCAAC 


96 


GCTGGATTCGCCTTTGCTACG 


97 


RPLB EC 66 
9_761 


TGTAATGAACCCTAAT GACCAT 
CCACACGG 


98 


CCAAGTGCTGGTTTACCCCATG 
GAGTA 


99 


RPLB EC 67 
1_762 


TAATGAACCCTAATGACCATCC 
ACACGGTG 




TCCAAGTGCTGGTTTACCCCAT 
GGAG 




RPOB EC 37 
75_3858 


CTTGGAGGTAAGTCTCATTTTG 
GTGGGCA 




nn t» a t a a rsp t c p iv p p n *r a & cp *p 
TGTAATGC 




VALS EC 18 
33 1943 


CGACGCGCTGCGCTTCAC 




GCGTTCCACAGCTTGTTGCAGA 
AG 




RPOB EC 13 
36 1455 


GACCACCTCGGCAACCGT 


106 


TT-CGCTCTCGGCCTGGCC 


107 


TUFB EC 22 
5_309 


GCACTATGCACACGTAGATTGT 
CCTGG 


108 


TATAGCACCATCCATCTGAGCG 
GCAC 


109 


DNAK EC 42 
8 522 


CGGCGTACTTCAACGACAGCCA 


110 


CGCGGTCGGCTCGTTGATGA 


111 
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VALS EC 19 ( 
20 1970 


:ttctgcaacaagctgtggaac 

3C 


112 


C CGGAGTTCATCAGCACGAAGC 


113 


TUFB EC 75 
7 867 


\AGACGACCTGCACGGGC 






115 


23S EC 264 
6 27 65 


3TGTTCTTAGTACGAGAGGACC 




'TCGTGCTTAGATGCTTTCAG 




16S EC 969 
_1078_3P 


ICGCGAAGAACCTTACpC 




LCGACACGAGCpTpGACGAC 




16S EC 972 
_1075_4P 


CGAAGAACpCpTTACC 


120 


\CACGAGCpTpGAC 


121 


16S EC 972 
1075 


CGAAGAACCTTACC 


122 


\CACGAGCTGAC 


123 


23S EC_- 
347 59 


CCTGATAAGGGTGAGGTCG 


124 


ACGTCCTTCATCGCCTCTGA 


125 


23S EC - 
7 450 


GTTGTGAGGTTAAGCGACTAAG 


126 


CTATCGGTCAGTCAGGAGTAT 


127 


23S EC - 
7 910 


GTTGTGAGGTTAAGCGACTAAG 


128 


TTGCATCGGGTTGGTAAGTC 


129 


23S EC 430 
1442 


ATACTCCTGACTGACCGATAG 


130 


AACATAGCCTTCTCCGTCC 


131 


23S EC 891 
1931 


GACTTACCAACCCGATGCAA 


132 


TACCTTAGGACCGTTATAGTTA 
CG 


133 


23S EC 142 
4 2494 


GGACGGAGAAGGCTATGTT 


134 


CCAAACACCGCCGTCGATAT 


135 


23S EC_190 
8 2852 


CGTAACTATAACGGTCCTAAGG 
TA 


136 


GCTTACACACCCGGCCTATC 


137 


23S_EC_247 
5~ 3209 


ATATCGACGGCGGTGTTTGG 


138 


GCGTGACAGGCAGGT ATT C 


139 


16S EC - 
60 525 


AGTCTCAAGAGTGAACACGTAA 


140 


GCTGCTGGCACGGAGTTA 


141 


16S EC 326 
1058 


GACACGGTCCAGACTCCTAC 


142 


CCATGCAGCACCTGTCTC 


143 


16S EC 705 
1512 


GATCTGGAGGAATACCGGTG 


144 


ACGGTTACCTTGTTACGACT 


145 


16S EC 126 
8 1775 


GAGAGCAAGCGGACCTCATA 


146 


CCTCCTGCGTGCAAAGC 


147 


GR01 EC 94 
1 1060 


TGGAAGATCTGGGTCAGGC 


148 


CAATCTGCTGACGGATCTGAGC 


149 


INFB EC 11 
03 1191 


GTCGTGAAAACGAGCTGGAAGA 


150 


CATGATGGTCACAACCGG 


151 


HFLB EC 10 
82 1168 


TGGCGAACCTGGTGAACGAAGC 


152 


CTTTCGCTTTCTCGAACTCAAC 
CAT 


153 


INFB EC 19 
69^205B 


CGTCAGGGTAAATTCCGTGAAG 
TTAA 


154 


AACTTCGCCTTCGGTCATGTT 


155 


GROL EC 21 
9~35(F 


GGTGAAAGAAGTTGCCTCTAAA 
GC 


156 


TTCAGGTCCATCGGGTTCATGC 
C 


157 


VALS EC 11 
05~1214 


CGTGGCGGCGTGGTTATCGA 


158 


ACGAACTGGATGTCGCCGTT 


159 


16S EC 556 
700 


CGGAATTACTGGGCGTAAAG 


160 


CGCATTTCACCGCTACAC 


161 


RPOC EC 12 
56 1315 


ACCCAGTGCTGCTGAACCGTGC 


162 


GTTCAAATGCCTGGATACCCA 


163 


16S EC 774 
894 


GGGAGCAAACAGGATTAGATAC 


164 


CGTACTCCCCAGGCG ."" 


" 165 


RPOC EC 15 
84 1643 


TGGCCCGAAAGAAGCTGAGCG 


166 


ACGCGGGCATGCAGAGATGCC 


167 


16S EC 108 
2 1196 


ATGTTGGGTTAAGTCCCGC 


168 


TGACGTCATCCCCACCTTCC 


169 
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16S EC 138 
9 1541 


CTTGTACACACCGCCCGTC 


170 


AAGGAGGTGATCCAGCC 


171 


16S EC 130 
3~1407 


CGGATTGGAGTCTGCAACTCG 


172 


GACGGGCGGTGTGTACAAG 


173 


23S EC 23 
~130 


GGTGGATGCCTTGGC 


174 


GGGTTTCCCCATTCGG 


175 


23S EC 187 
256 


GGGAACTGAAACATCTAAGTA 


176 


TTCGCTCGCCGCTAC 


177 


23S EC 160 
2 1703 


TACCCCAAACCGACACAGG 


178 


CCTTCTCCCGAAGTTACG 


179 


23S EC 168 
5 1842 


CCGTAACTTCGGGAGAAGG 


180 


CACCGGGCAGGCGTC 


181 


23S EC 182 
7 1949 


GACGCCTGCCCGGTGC 


182 


CCGACAAGGAATTTCGCTACC 


183 


23S EC 243 
4~2511 


AAGGTACTCCGGGGATAACAGG 
C 


184 


AGCCGACATCGAGGTGCCAAAC 


185 


23S EC 259 
9~2669 


GACAGTT C GGTCCCTATC 


186 


CCGGTCCTCTCGTACTA 


187 . 


23S EC 265 
3~2758 


TAGTACGAGAGGACCGG 


188 


TTAGATGCTTTCAGCACTTATC 


189 


23S BS - 
68 21 


AAACTAGATAACAGTAGACATC 
AC 


190 


GTGCGCCCTTTCTAACTT 


191 


16S EC 8 3 
~58 


AGAGTTTGATCATGGCTCAG 


192 


ACTGCTGCCTCCCGTAG 


193 


16S EC 314 
575 


CACTGGAACTGAGACACGG 


194 


CTTTACGCCCAGTAATTCCG 


195 


16S EC 518 
795 


CCAGCAGCCGCGGTAATAC 


196 




197 


16S EC 683 
~985 


GTGTAGCGGTGAAATGCG 


198 


GGTAAGGTTCTTCGCGTTG 


199 


16S EC 937 
1240 


AAGCGGTGGAGCATGTGG 


200 


ATTGTAGCACGTGTGTAGCCC 


201 


16S EC 119 
5 1541 


CAAGTCATCATGGCCCTTA 


202 


AAGGAGGTGATCCAGCC 


203 


16S EC 8 1 
541 


AGAGTTTGATCATGGCTCAG 


204 


AAGGAGGTGATCCAGCC 


205 


23S EC 183 
1 1936 


ACCTGCCCAGTGCTGGAAG 


206 


TCGCTACCTTAGGACCGT 


207 


16S EC 138 
7 1513 


GCCTTGTACACACCTCCCGTC 


208 


CACGGCTACCTTGTTACGAC 


209 


16S EC 139 
0 1505 


TTGTACACACCGCCCGTCATAC 


210 


CCTTGTTACGACTTCACCCC 


211 


16S EC 136 
7 1506 


TACGGTGAATACGTTCCCGGG 


212 


ACCTTGTTACGACTTCACCCCA 


213 


16S EC 804 
929 


ACCACGCCGTAAACGATGA 


214 


CCCCCGTCAATTCCTTTGAGT 


215 


16S EC 791 
904 


GATACCCTGGTAGTCCACACCG 


216 


GCCTTGCGACCGTACTCCC 


217 


16S EC 789 
899 


TAGATACCCTGGTAGTCCACGC 


218 


GCGACCGTACTCCCCAGG 


219 


16S_EC_109 


TAGTCCCGCAACGAGCGC 


220 


GACGTCATCCCCACCTTCCTCC 


221 


23S EC 258 
6~2677 


TAGAACGTCGCGAGACAGTTCG 


222 


AGTCCATCCCGGTCCTCTCG 


223 


HEXAMER EC 
61 362 


GAGGAAAGTCCGGGCTC 


224 


ATAAGCCGGGTTCTGTCG 


225 


RNASEP BS 
43 384 


GAGGAAAGTCCATGCTCGC 


226 


GTAAGCCATGTTTTGTTCCATC 


227 


RNASEP EC 


GAGGAAAGTCCGGGCTC 


228 


ATAAGCCGGGTTCTGTCG 


229 
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61 362 










YAED_TRNA_ 

ALA- 
RRNH EC 51 

3 49 


GCGGGATCCTCTAGAGGTGTTA 
AATAGCCTGGCAG 


230 


GCGGGATCCTCTAGAAGACCTC 
CTGCGT GCAAAGC 


231 


RNASEP SA 
31 379 


GAGGAAAGTCCATGCTCAC 


232 


ATAAGCCATGTTCTGTTCCATC 


233 


16S EC 108 
2 1541 


ATGTTGGGTTAAGTCCCGC 


234 


AAGGAGGTGATCCAGCC 


235 


16S EC 556 
795 


CGGAATTACTGGGCGTAAAG 


236 


GTATCTAATCCTGTTTGCTCCC 


237 


16S EC 108 
2_1196_10G 


ATGTTGGGTTAAGTCCCGC 


238 


TGACGTCATGCCCACCTTCC 


239 


16S EC 108 
2 1196 10G 
11G 


ATGTTGGGTTAAGTCCCGC 


240 


TGACGTCATGGCCACCTTCC 


241 


TRNA ILERR 
NH ASPRRNH 
EC 32 41 


GCGGGATCCTCTAGACCTGATA 
AGGGTGAGGTCG 


242 


GCGGGATCCTCTAGAGCGTGAC 
AGGCAGGTATTC 


243 


16S EC 969 
1407 


ACGCGAAGAACCTTACC 


244 


GACGGGCGGTGTGTACAAG 


245 


16S EC 683 
1323 


GTGTAGCGGTGAAATGCG 


246 


CGAGTTGCAGACTGCGATCCG 


247 


16S EC 49 
894 


TAACACATGCAAGTCGAACG 


248 


CGTACTCCCCAGGCG 


249 


16S EC 49 
1078 


TAACACATGCAAGTCGAACG 


250 


ACGACACGAGCTGACGAC 


251 


CYA BA 134 
9 1447 


ACAACGAAGTACAATACAAGAC 


252 


CTTCTACATTTTTAGCCATCAC 


253 


16S EC 109 
0_1196_2 


TTAAGTCCCGCAACGAGCGCAA 


254 


TGACGTCATCCCCACCTTCCTC 




16S EC 405 
_527 


TGAGTGATGAAGGCCTTAGGGT 
TGTAAA 


256 


CGGCTGCTGGCACGAAGTTAG 


257 


GROL EC 49 
6 596 


ATGGACAAGGTTGGCAAGGAAG 
G 


258 


TAGCCGCGGTCGAATTGCAT 


259 


GROL EC 51 
1_593 


AAGGAAGGCGTGATCACCGTTG 
AAGA 


260 


CCGCGGTCGAATTGCATGCCTT 
C 


261 


VALS EC 18 
35 1928 


ACGCGCTGCGCTTCAC 


262 


TTGCAGAAGTTGCGGTAGCC 


263 


RPOB EC 13 
34 1478 


TCGACCACCTGGGCAACC 


264 


ATCAGGTCGTGCGGCATCA 


265 


DNAK EC 42 
0 521 


CACGGTGCCGGCGTACT 


266 


GCGGTCGGCTCGTTGATGAT 


267 


RPOB EC 37 
76 3853 


TTGGAGGTAAGTCTCATTTTGG 
TGG 


268 


AAGCTGCACCATAAGCTTGTAA 
TGC 


269 


RPOB EC 38 
02 3885 


CAGCGTTTCGGCGAAATGGA 


270 


CGACTTGACGGTTAACATTTCC 
TG 


271 


RPOB EC 37 
99_3888 


GGGCAGCGTTTCGGCGAAATGG 
A 


272 


GTCCGACTTGACGGTCAACATT 
TCCTG 


273 


RPOC EC 21 
46_2245 


CAGGAGTCGTTCAACTCGATCT 
ACATGAT 


274 


ACGCCATCAGGCCACGCAT 


275 


ASPS EC -40 
'5 538 




276 ■ 


AGGGCACGAGGTAGTCGC • 


• 277 


RPOC EC 13 
74 1455 


CGCCGACTTCGACGGTGACC 


278 


GAGCATCAGCGTGCGTGCT 


279 


TOFB EC 95 
7 1058 


CCACACGCCGTTCTTCAACAAC 
T 


280 


GGCATCACCATTTCCTTGTCCT 
TCG 


281 
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16S EC 7 1 
22 


GAGAGTTTGATCCTGGCTCAGA 
ACGAA 


282 


TGTTACTCACCCGTCTGCCACT 


283 


VALS EC 61 
0~727 


ACCGAGCAAGGAGACCAGC 


284 


TATAACGCACATCGTCAGGGTG 
A 


285 



For evaluation in the laboratory, five species of bacteria were selected, including three
γ-proteobacteria (E. coli, K. pneumoniae and P. aeruginosa) and two low G+C gram positive
bacteria (B. subtilis and S. aureus). The identities of the organisms were not revealed to the
laboratory technicians.

Bacteria were grown in culture, DNA was isolated and processed, and PCR was performed
using standard protocols. Following PCR, all samples were desalted, concentrated, and analyzed
by Fourier Transform Ion Cyclotron Resonance (FTICR) mass spectrometry. Due to the
extremely high precision of the FTICR, masses could be measured to within 1 Da and
unambiguously deconvoluted to a single base composition. The measured base compositions
were compared with the known base composition signatures in our database. As expected when
using broad range survey 16S primers, several phylogenetic near-neighbor organisms were
difficult to distinguish from our test organisms. Additional non-ribosomal primers were used to
triangulate and further resolve these clusters.
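The deconvolution of a measured strand mass to candidate base compositions can be illustrated with a brute-force search. This is a sketch under stated assumptions: the residue masses are approximate average masses of the four deoxynucleotide residues, the end-group correction assumes an unphosphorylated single strand, and the names `strand_mass` and `candidate_compositions` are hypothetical. None of these constants is taken from this disclosure.

```python
# Approximate average residue masses in daltons (assumed values).
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}
END_CORRECTION = -61.96  # assumed correction for an unphosphorylated strand


def strand_mass(a, c, g, t):
    """Approximate mass of a single DNA strand with the given base counts."""
    return (a * RESIDUE_MASS["A"] + c * RESIDUE_MASS["C"]
            + g * RESIDUE_MASS["G"] + t * RESIDUE_MASS["T"] + END_CORRECTION)


def candidate_compositions(measured_da, tol_da):
    """List every (A, C, G, T) count whose mass lies within tol_da of the measurement."""
    m_a, m_c, m_g, m_t = (RESIDUE_MASS[b] for b in "ACGT")
    target = measured_da - END_CORRECTION
    results = []
    for a in range(int((target + tol_da) // m_a) + 1):
        rem_a = target - a * m_a
        for g in range(int((rem_a + tol_da) // m_g) + 1):
            rem_g = rem_a - g * m_g
            for c in range(int((rem_g + tol_da) // m_c) + 1):
                rem_c = rem_g - c * m_c
                # The T count is fixed by whatever mass remains.
                t = round(rem_c / m_t)
                if t >= 0 and abs(rem_c - t * m_t) <= tol_da:
                    results.append((a, c, g, t))
    return results
```

A tighter mass tolerance shortens the candidate list, which is why high measurement precision matters for arriving at a single composition.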

An example of the use of primers directed to regions of RNA polymerase B (rpoB) is
shown in Figure 19. This gene has the potential to provide broad priming and resolving
capabilities. A pair of primers directed against a conserved region of rpoB provided distinct base
composition signatures that helped resolve the tight Enterobacteriaceae cluster. Joint probability
estimates of the signatures from each of the primers resulted in the identification of a single
organism that matched the identity of the test sample. Therefore, a combination of a small
number of primers that amplify selected regions of the 16S ribosomal RNA gene and a few
additional primers that amplify selected regions of protein encoding genes provides sufficient
information to detect and identify all bacterial pathogens.
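The joint probability step can be sketched as a product of per-primer likelihoods. This is an illustration only: it assumes that the primer pairs give independent evidence and that each primer pair supplies a likelihood for every candidate organism. The function name `joint_posterior` and the data layout are hypothetical.

```python
from math import prod


def joint_posterior(per_primer_likelihoods):
    """Combine per-primer likelihoods {organism: P(observed BCS | organism)}.

    Likelihoods are multiplied across primer pairs (independence is assumed)
    and normalized over the organisms that appear for every primer pair.
    """
    organisms = set.intersection(*(set(d) for d in per_primer_likelihoods))
    joint = {org: prod(d[org] for d in per_primer_likelihoods) for org in organisms}
    total = sum(joint.values())
    if total == 0:
        return {}
    return {org: p / total for org, p in joint.items()}
```

In this picture a 16S primer pair that cannot separate two near neighbors contributes equal likelihoods, and a second primer pair that does separate them decides the identification.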

Example 16: Detection of Staphylococcus aureus in Blood Samples

Blood samples in an analysis plate were spiked with genomic DNA equivalent of 10^3
organisms/ml of Staphylococcus aureus. A single set of 16S rRNA primers was used for
amplification. Following PCR, all samples were desalted, concentrated, and analyzed by Fourier
Transform Ion Cyclotron Resonance (FTICR) mass spectrometry. In each of the spiked wells,
strong signals were detected which are consistent with the expected BCS of the S. aureus
amplicon (Figure 20). Furthermore, there was no robotic carryover or contamination in any of
the blood only or water blank wells. Methods similar to this one will be applied for other
clinically relevant samples including, but not limited to: urine and throat or nasal swabs.

Example 17: Detection and Serotyping of Viruses 
The virus detection capability of the present invention was demonstrated in
collaboration with Naval health officers using adenoviruses as an example.

All available genomic sequences for human adenoviruses in public databases
were surveyed. The hexon gene was identified as a candidate likely to have broad specificity
across all serotypes. Four primer pairs were selected from a group of primers designed to yield
broad coverage across the majority of the adenoviral strain types (Table 9), wherein Tp =
5'propynylated uridine and Cp = 5'propynylated cytidine.

Table 9

Intelligent Primer Pairs for Serotyping of Adenoviruses

Primer Pair | Forward Primer Sequence | SEQ ID NO: | Reverse Primer Sequence | SEQ ID NO:
HEX_HAD7+4+21_934_995 | AGACCCAATTACATTGGCTT | 286 | CCAGTGCTGTTGTAGTACAT | 287
HEX_HAD7+4+21_976_1050 | ATGTACTACAACAGTACTGG | 288 | CAAGTCAACCACAGCATTCA | 289
HEX_HAD7+4+21_970_1059 | GGGCTTATGTACTACAACAG | 290 | TCTGTCTTGCAAGTCAACCAC | 291
HEX_HAD7+3_771_827 | GGAATTTTTTGATGGTAGAGA | 292 | TAAAGCACAATTTCAGGCG | 293
HEX_HAD4+16_746_848 | TAGATCTGGCTTTCTTTGAC | 294 | ATATGAGTATCTGGAGTCTGC | 295
HEX_HAD7_509_578 | GGAAAGACATTACTGCAGACA | 296 | CCAACTTGAGGCTCTGGCTG | 297
HEX_HAD4_1216_1289 | ACAGACACTTACCAGGGTG | 298 | ACTGTGGTGTCATCTTTGTC | 299
HEX_HAD21_515_567 | TCACTAAAGACAAAGGTCTTCC | 300 | GGCTTCGCCGTCTGTAATTTC | 301
HEX_HAD_1342_1469 | CGGATCCAAGCTAATCTTTGG | 302 | GGTATGTACTCATAGGTGTTGGTG | 303
HEX_HAD7+4+21_934_995P | AGACpCpCAATTpACpATpTGGCTT | 304 | CpCpAGTGCTGTpTpGTAGTACAT | 305
HEX_HAD7+4+21_976_1050P | ATpGTpACTpACAACAGTACpTpGG | 306 | CAAGTpCpAACCACAGCATpTpCA | 307
HEX_HAD7+4+21_970_1059P | GGGCpTpTATpGTpACTACAACpAG | 308 | TCTGTpCpTTGCAAGTpCpAACCAC | 309
HEX_HAD7+3_771_827P | GGAATTpTpTpTpTGATGGTAGAGA | 310 | TAAAGCACAATpTpTpCpAGGCG | 311
HEX_HAD4+16_746_848P | TAGATCTGGCTpTpTpCpTTTGAC | 312 | ATATGAGTATpCpTpGGAGTpCpTGC | 313
HEX_HAD_1342_1469P | CGGATpCCAAGCpTAATCpTpTTGG | 314 | GGTATGTACTCATAGGTGTpTpGGTG | 315
HEX_HAD7+21+3_931_1645 | AACAGACCCAATTACATTGGCTT | 316 | GAGGCACTTGTATGTGGAAAGG | 317
HEX_HAD4+2_925_1469 | ATGCCTAACAGACCCAATTACA[?] | 318 | TTCATGTAGTCGTAGGTGTTGG | 319
HEX_HAD7+21+3_384_953 | CGCGCCTAATACATCTCAGTGGAT | 320 | AAGCCAATGTAATTGGGTCTGTT | 321
HEX_HAD4+2_345_947 | CTACTCTGGCACTGCCTACAAC | 322 | ATGTAATTGGGTCTGTTAGGCAT | 323
HEX_HAD2_772_865 | CAATCCGTTCTGGTTCCGGATGAA | 324 | CTTGCCGGTCGTTCAAAGAGGTAG | 325
HEX_HAD7+4+21_73_179 | [illegible] | 326 | CGGTCGGTGGTCACATC | 327
HEX_HAD7+4+21_1_54 | ATGGCCACCCCATCGATG | 328 | CTGTCCGGCGATGTGCATG | 329
HEX_HAD7+4+21_1612_1718 | GGTCGTTATGTGCCTTTCCACAT | 330 | TCCTTTCTGAAGTTCCACTCATAGG | 331
HEX_HAD7+4+21_2276_2368 | ACAACATTGGCTACCAGGGCTT | 332 | CCTGCCTGCTCATAGGCTGGAAGTT | 333



These primers also served to clearly distinguish those strains responsible for most
disease (types 3, 4, 7 and 21) from all others. DNA isolated from field samples known to contain
adenoviruses was tested using the hexon gene PCR primers, which provided unambiguous
strain identification for all samples. A single sample was found to contain a mixture of two viral
DNAs belonging to strains 7 and 21.

Test results (Figure 21) showed perfect concordance between predicted and observed
base composition signatures for each of these samples. Classical serotyping results confirmed
each of these observations. Processing of viral samples directly from collection material such as
throat swabs, rather than from isolated DNA, will result in a significant increase in throughput,
eliminating the need for virus culture.

Example 18: Broad Rapid Detection and Strain Typing of Respiratory Pathogens for 
Epidemic Surveillance 

Genome preparation: Genomic materials from culture samples or swabs were prepared
using a modified robotic protocol (DNeasy™ 96 Tissue Kit, Qiagen). Cultures of
Streptococcus pyogenes were pelleted and transferred to a 1.5 mL tube containing 0.45 g of 0.7
mm Zirconia beads (Biospec Products, Inc.). Cells were lysed by shaking for 10 minutes at a
speed of 19 1/s using a MM300 Vibration Mill (Retsch, Germany). The samples were
centrifuged for 5 min and the supernatants transferred to deep well blocks and processed using
the manufacturer's protocol and a Qiagen 8000 BioRobot.

PCR: PCR reactions were assembled using a Packard MPII liquid handling platform
and were performed in 50 µL volume using 1.8 units each of Platinum Taq (Invitrogen) and
Hotstart PFU Turbo (Stratagene) polymerases. Cycling was performed on a DNA Engine Dyad
(MJ Research) with cycling conditions consisting of an initial 2 min at 95°C followed by 45
cycles of 20 s at 95°C, 15 s at 58°C, and 15 s at 72°C.

Broad-range primers: PCR primer design for base composition analysis from precise
mass measurements is constrained by an upper limit on amplicon length at which ionization and
accurate deconvolution can be achieved. Currently, this limit is approximately 140 base pairs.
Primers designed to broadly conserved regions of bacterial ribosomal RNAs (16S and 23S) and
the gene encoding ribosomal protein L3 (rpoC) are shown in Table 10.



Table 10
Broad Range Primer Pairs

Target Gene | Direction | Primer | SEQ ID NO: | Length of Amplicon
16S_1 | F | GGATTAGAGACCCTGGTAGTCC | 334 | 116
16S_1 | R | GGCCGTACTCCCCAGGCG | 335 | 116
16S_2 | F | TTCGATGCAACGCGAAGAACCT | 336 | 115
16S_2 | R | ACGAGCTGACGACAGCCATG | 337 | 115
23S | F | TCTGTCCCTAGTACGAGAGGACCGG | 338 | 118
23S | R | TGCTTAGATGCTTTCAGC | 339 | 118
rpoC | F | CTGGCAGGTATGCGTGGTCTGATG | 340 | 121
rpoC | R | CGCACCGTGGGTTGAGATGAAGTAC | 341 | 121



Emm-typing primers: The allelic profile of a GAS strain by Multilocus Sequencing
Technique (MLST) can be obtained by sequencing the internal fragments of seven housekeeping
genes. The nucleotide sequences for each of these housekeeping genes, for 212 isolates of GAS
(78 distinct emm types), are available (www.mlst.net). This corresponds to one hundred different
allelic profiles or unique sequence types, referred to by Enright et al. as ST1-ST100 (Enright et
al., Infection and Immunity, 2001, 69, 2416-2427). For each sequence type, we created a virtual
transcript by concatenating sequences appropriate to their allelic profile from each of the seven
genes. MLST primers were designed using these sequences and were constrained to be within
each gene locus. Twenty-four primer pairs were initially designed and tested against the sequenced
GAS strain 700294. A final subset of six primer pairs (Table 11) was chosen based on a theoretical
calculation of the minimal number of primer pairs that maximized resolution between emm types.

Table 11

Drill-Down Primer Pairs Used in Determining emm-type

Target Gene | Direction | Primer | SEQ ID NO: | Length of Amplicon
gki | F | GGGGATTCAGCCATCAAAGCAGCTATTGA | 342 | 116
gki | R | CCAACCTTTTCCACAACAGAATCAGC | 343 | 116
gtr | F | CCTTACTTCGAACTATGAATCTTTTGGAAG | 344 | 115
[illegible] | R | CCCATTTTTTCACGCATGCTGAAAATATC | [illegible] | [illegible]
murI | F | CGCAAAAAAATCCAGCTATTAGC | 346 | 118
murI | R | AAACTATTTTTTTAGCTATACTCGAACAC | 347 | 118
mutS | F | ATGATTACAATTCAAGAAGGTCGTCACGC | 348 | 121
mutS | R | TTGGACCTGTAATCAGCTGAATACTGG | 349 | 121
xpt | F | GATGACTTTTTAGCTAATGGTCAGGCAGC | 350 | 122
xpt | R | AATCGACGACCATCTTGGAAAGATTTCTC | 351 | 122
yqiL | F | GCTTCAGGAATCAATGATGGAGCAG | 352 | 119
yqiL | R | GGGTCTACACCTGCACTTGCATAAC | 353 | 119



Microbiology: GAS isolates were identified from swabs on the basis of colony
morphology and beta-hemolysis on blood agar plates, gram stain characteristics, susceptibility to
bacitracin, and positive latex agglutination reactivity with group A-specific antiserum.

Sequencing: Bacterial genomic DNA samples of all isolates were extracted from freshly
grown GAS strains by using QIAamp DNA Blood Mini Kit (Qiagen, Valencia, CA) according to
the procedures described by the manufacturer. Group A streptococcal cells were subjected to PCR
and sequence analysis using emm-gene specific PCR as previously described (Beall et al., J.
Clin. Micro., 1996, 34, 953-958; and Facklam et al., Emerg. Infect. Dis., 1999, 5, 247-253).
Homology searches on DNA sequences were conducted against known emm sequences present
at www.cdc.gov/ncidod/biotech/infotech_hp.html. For MLST analysis, internal fragments of
seven housekeeping genes were amplified by PCR and analyzed as previously described
(Enright et al., Infection and Immunity, 2001, 69, 2416-2427). The emm-type was determined
from comparison to the MLST database.

Broad Range Survey/Drill-Down Process (100): For Streptococcus pyogenes, the
objective was the identification of a signature of the virulent epidemic strain and determination
of its emm-type. Emm-type information is useful both for treatment considerations and epidemic
surveillance. A total of 51 throat swabs were taken both from healthy recruits and from
hospitalized patients in December 2002, during the peak of a GAS outbreak at a military training
camp. Twenty-seven additional isolates from previous infections ascribed to GAS were also
examined. Initially, isolated colonies were examined both from throat culture samples and throat
swabs directly without the culture step. The latter path can be completed within 6-12 hours,
providing information on a significant number of samples rapidly enough to be useful in
managing an ongoing epidemic.

The process of broad range survey/drill-down (200) is shown in Figure 22. A clinical
sample such as a throat swab is first obtained from an individual (201). Broad range survey
primers are used to obtain amplification products from the clinical sample (202) which are
analyzed to determine a BCS (203) from which a species is identified (204). Drill-down primers
are then employed to obtain PCR products (205) from which specific information is obtained
about the species (such as Emm-type) (206).

Broad Range Survey Priming: Genomic regions targeted by the broad range survey
primers were selected for their ability to allow amplification of virtually all known species of
bacteria and for their capability to distinguish bacterial species from each other by base
composition analysis. Initially, four broad-range PCR target sites were selected and the primers
were synthesized and tested. The targets included universally conserved regions of 16S and 23S
rRNA, and the gene encoding ribosomal protein L3 (rpoC).

While there was no special consideration of Streptococcus pyogenes in the selection of
the broad range survey primers (which were optimized for distinguishing all important pathogens
from each other), analysis of genomic sequences showed that the base compositions of these
regions distinguished Streptococcus pyogenes from other respiratory pathogens and normal flora,
including closely related species of streptococci, staphylococci, and bacilli (Figure 23).

Drill Down Priming (Emm-Typing): In order to obtain strain-specific information about
the epidemic, a strategy was designed to measure the base compositions of a set of fast clock
target genes to generate strain-specific signatures and simultaneously correlate with emm-types.
In classic MLST analysis, internal fragments of seven housekeeping genes (gki, gtr, murI, mutS,
recP, xpt, yqiL) are amplified, sequenced and compared to a database of previously studied
isolates whose emm-types have been determined (Horner et al. Fundamental and Applied
Toxicology, 1997, 36, 147). Since the analysis enabled by the present embodiment of the present
invention provides base composition data rather than sequence data, the challenge was to identify
the target regions that provide the highest resolution of species and least ambiguous
emm-classification. The data set from Table 2 of Enright et al. (Enright et al. Infection and
Immunity, 2001, 69, 2416-2427) was used to bioinformatically construct an alignment of
concatenated alleles of the seven housekeeping genes from each of 212 previously emm-typed
strains, of which 101 were unique sequences that represented 75 distinct emm-types. This
alignment was then analyzed to determine the number and location of the optimal primer pairs
that would maximize strain discrimination strictly on base composition data.
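The selection of a small primer set that maximizes strain discrimination can be illustrated with a greedy heuristic. This is a sketch only: the disclosure refers to a theoretical calculation without giving its details, so the greedy strategy, the function names and the data layout (a map from strain to the base composition signature predicted for each primer pair) are all assumptions.

```python
def resolved_groups(signatures, chosen):
    """Count distinct signature tuples across strains for the chosen primer pairs."""
    return len({tuple(sig[p] for p in chosen) for sig in signatures.values()})


def greedy_primer_selection(signatures, primer_pairs, max_pairs):
    """Repeatedly add the primer pair that most increases the number of resolved groups."""
    chosen = []
    while len(chosen) < max_pairs:
        remaining = [p for p in primer_pairs if p not in chosen]
        if not remaining:
            break
        best = max(remaining, key=lambda p: resolved_groups(signatures, chosen + [p]))
        # Stop as soon as an extra primer pair adds no further resolution.
        if resolved_groups(signatures, chosen + [best]) <= resolved_groups(signatures, chosen):
            break
        chosen.append(best)
    return chosen
```

A greedy search does not guarantee the true minimum number of primer pairs; an exhaustive search over subsets would, at higher computational cost.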

An example of assignment of BCSs of PCR products is shown in Figure 24 where PCR
products obtained using the gtr primer (a drill-down emm-typing primer) from two different
swab samples were analyzed (sample 12 - top and sample 10 - bottom). The deconvoluted
ESI-FTICR spectra provide accurate mass measurements of both strands of the PCR products, from
which a series of candidate BCSs were calculated from the measured mass (and within the
measured mass uncertainty). The identification of complementary candidate BCSs from each
strand provides a means for unambiguous assignment of the BCS of the PCR product. BCSs and
molecular masses for each strand of the PCR product from the two different samples are also
shown in Figure 24. In this case, the determination of BCSs for the two samples resulted in the
identification of the emm-type of Streptococcus pyogenes - sample 12 was identified as
emm-type 3 and sample 10 was identified as emm-type 6.
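The complementarity check between the two strands can be sketched as follows. Compositions are written as (A, C, G, T) counts; the function names and the example tuples used with them are hypothetical. The only fact relied on is Watson-Crick pairing: the A count of one strand equals the T count of the other, and likewise for C and G.

```python
def complement(comp):
    """Base composition of the complementary strand for an (A, C, G, T) tuple."""
    a, c, g, t = comp
    return (t, g, c, a)


def consistent_pairs(forward_candidates, reverse_candidates):
    """Keep forward candidates whose complement is also a reverse-strand candidate."""
    reverse_set = set(reverse_candidates)
    return [(f, complement(f)) for f in forward_candidates if complement(f) in reverse_set]
```

When exactly one pair survives, the base composition of the double-stranded product is assigned without ambiguity.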

The results of the composition analysis using the six primer pairs, 5'-emm gene
sequencing and MLST gene sequencing method for the GAS epidemic at a military training
facility are compared in Figure 25. The base composition results for the six primer pairs showed
a perfect concordance with 5'-emm gene sequencing and MLST sequencing methods. Of the 51
samples taken during the peak of the epidemic, all but three had identical compositions and
corresponded to emm-type 3. The three outliers, all from healthy individuals, probably represent
non-epidemic strains harbored by asymptomatic carriers. Samples 52-80, which were archived
from previous infections from Marines at other naval training facilities, showed a much greater
heterogeneity of composition signatures and emm-types.

Example 19: Base Composition Probability Clouds

Figure 18 illustrates the concept of base composition probability clouds via a pseudo-four
dimensional plot of base compositions of enterobacteria including Y. pestis, Y.
pseudotuberculosis, S. typhimurium, S. typhi, Y. enterocolitica, E. coli K12, and E. coli
O157:H7. In the plot of Figure 18, A, C and G compositions correspond to the x, y and z axes
respectively whereas T compositions are represented by the size of the sphere at the junction of
the x, y and z coordinates. There is no absolute requirement for having a particular nucleobase
composition associated with a particular axis. For example, a plot could be designed wherein G,
T and C compositions correspond to the x, y and z axes respectively whereas the A composition
corresponds to the size of the sphere at the junction of the x, y and z coordinates. Furthermore, a
different representation can be made of the "pseudo fourth" dimension, i.e., other than the size of
the sphere at the junction of the x, y and z coordinates. For example, a symbol having vector
information such as an arrow or a cone can be rotated at an angle that varies proportionally with
the composition of the nucleobase corresponding to the pseudo fourth dimension. The choice of
axes and pseudo fourth dimensional representation is typically made with the aim of optimal
visualization of the data being presented.
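The mapping from a base composition to a pseudo-four dimensional plot point can be sketched as below. The function name, the dictionary layout and the size scale factor are assumptions for illustration; the axis assignment is a parameter, reflecting that no particular nucleobase has to be tied to a particular axis.

```python
def pseudo_4d_point(composition, axes=("A", "C", "G"), size_base="T", size_scale=10.0):
    """Map base counts to x, y, z coordinates plus a marker size for the fourth base."""
    x, y, z = (composition[b] for b in axes)
    return {"x": x, "y": y, "z": z, "size": size_scale * composition[size_base]}
```

The returned values could be passed to any three dimensional scatter plotting routine, with `size` controlling the sphere size.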

A similar base composition probability cloud analysis has been presented for a series of
viruses in U.S. provisional patent application Serial No. 60/431,319, which is commonly owned
and incorporated herein by reference in its entirety. In this base composition probability cloud
analysis, the closely related Dengue virus types 1-4 are clearly distinguishable from each other.
This example is indicative of a challenging scenario for species identification based on BCS
analysis: because RNA viruses have a high mutation rate, it would be expected to be difficult to
resolve closely related species. However, as this example illustrates, BCS analysis, aided by
base composition probability cloud analysis, is capable of resolution of closely related viral
species.

A base composition probability cloud can also be represented as a three dimensional
plot instead of a pseudo-four dimensional plot. An example of such a three dimensional plot is a
plot in which G, A and C compositions correspond to the x, y and z axes respectively, while the
composition of T is left out of the plot. Another such example is a plot where the compositions
of all four nucleobases are included: G, A and C+T compositions correspond to the x, y and z axes
respectively. As for the pseudo-four dimensional plots, the choice of axes for a three dimensional
plot is typically made with the aim of optimal visualization of the data being presented.

Example 20: Biochemical Processing of Large Amplification Products for Analysis by Mass
Spectrometry

In the example illustrated in Figure 26, the amplification product of a primer pair which
amplifies a 986 bp region of the 16S ribosomal gene in E. coli (K12) was digested with a mixture
of 4 restriction enzymes: BstNI, BsmFI, BfaI, and NcoI. Figure 26(a) illustrates the complexity
of the resulting ESI-FTICR mass spectrum that contains multiple charge states of multiple
restriction fragments. Upon mass deconvolution to neutral mass, the spectrum is significantly
simplified and discrete oligonucleotide pairs are evident (Figure 26b). When base compositions
are derived from the masses of the restriction fragments, perfect agreement is observed for the
known sequence of nucleotides 1-856 (Figure 26c). The batch of NcoI enzyme used in this
experiment was inactive, which resulted in a missed cleavage site, and a 197-mer fragment went
undetected because it is outside the mass range of the mass spectrometer under the conditions
employed. Interestingly, however, both a forward and reverse strand were detected for each
fragment measured (solid and dotted lines, respectively) within 2 ppm of the predicted molecular
weights, resulting in unambiguous determination of the base composition of 788 nucleotides of
the 985 nucleotides in the amplicon. The coverage map offers redundant coverage as both 5' to 3'
and 3' to 5' fragments are detected for fragments covering the first 856 nucleotides of the amplicon.
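
The effect of a missed cleavage site on coverage can be sketched with a small in silico digest. This is an illustrative sketch under stated assumptions, not the procedure used to produce Figure 26. The recognition sites given for BfaI (C^TAG) and NcoI (C^CATGG) are the commonly documented ones and should be verified against supplier documentation; the toy sequence, the function names, and the use of a maximum fragment length as a stand-in for the mass range of the instrument are all hypothetical.

```python
import re

# Recognition site and top-strand cut offset for each enzyme.
# Assumed values for illustration; verify before any real use.
ENZYMES = {
    "BfaI": ("CTAG", 1),    # C^TAG
    "NcoI": ("CCATGG", 1),  # C^CATGG
}


def digest(seq, enzymes, active=None):
    """Return top-strand fragments as 1-based inclusive (start, end) pairs."""
    names = list(enzymes) if active is None else list(active)
    cuts = set()
    for name in names:
        site, offset = enzymes[name]
        # A lookahead pattern finds overlapping occurrences of the site too.
        for match in re.finditer(f"(?={site})", seq):
            cuts.add(match.start() + offset)
    bounds = [0] + sorted(cuts) + [len(seq)]
    return [(a + 1, b) for a, b in zip(bounds, bounds[1:]) if b > a]


def detectable_coverage(fragments, max_len):
    """Count nucleotides lying in fragments no longer than max_len."""
    return sum(end - start + 1 for start, end in fragments
               if end - start + 1 <= max_len)


toy = "AAACTAGGGGGCCATGGTTTT"  # hypothetical 21-nt sequence

all_active = digest(toy, ENZYMES)                 # both enzymes cut
ncoi_inactive = digest(toy, ENZYMES, ["BfaI"])    # NcoI site is missed
```

With both enzymes active, every fragment of the toy sequence is short; with NcoI left out, two fragments merge into one longer fragment, which `detectable_coverage` then excludes if it exceeds the chosen length limit.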

This approach is in many ways analogous to those widely used in MS-based proteomics
studies in which large intact proteins are digested with trypsin, or other proteolytic enzyme(s),
and the identity of the protein is derived by comparing the measured masses of the tryptic
peptides with theoretical digests. A unique feature of this approach is that the precise mass
measurements of the complementary strands of each digest product allow one to derive a de
novo base composition for each fragment, which can in turn be "stitched together" to derive a
complete base composition for the larger amplicon. An important distinction between this
approach and a gel-based restriction mapping strategy is that, in addition to determination of the
length of each fragment, an unambiguous base composition of each restriction fragment is
derived. Thus, a single base substitution within a fragment (which would not be resolved on a
gel) is readily observed using this approach. Because this study was performed on a 7 Tesla
ESI-FTICR mass spectrometer, better than 2 ppm mass measurement accuracy was obtained for all
fragments. Interestingly, calculation of the mass measurement accuracy required to derive
unambiguous base compositions from the complementary fragments indicates that the highest
mass measurement accuracy actually required is only 15 ppm for the 139 bp fragment
(nucleotides 525-663). Most of the fragments were in the 50-70 bp size-range which would
require mass accuracy of only ~50 ppm for unambiguous base composition determination. This
level of performance is achievable on other more compact, less expensive MS platforms such as
the ESI-TOF, suggesting that the methods developed here could be widely deployed in a variety
of diagnostic and human forensic arenas.
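
The role of the complementary strand in removing ambiguity can be sketched as follows. This is a simplified illustration, not the calculation performed in the study: the residue masses are rounded average values assumed for the example, terminal groups are ignored, the fragment length is taken as known, and all function names and the 45-nt composition are hypothetical. The sketch enumerates every base composition whose mass lies within a ppm tolerance of a measured strand mass, then keeps only those whose Watson-Crick complement also matches the mass of the opposite strand.

```python
# Rounded average residue masses in Da (assumed for illustration only;
# terminal groups are ignored).
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}


def strand_mass(comp):
    """Mass of one strand from its (A, C, G, T) counts."""
    a, c, g, t = comp
    m = RESIDUE_MASS
    return a * m["A"] + c * m["C"] + g * m["G"] + t * m["T"]


def candidates(measured, length, ppm):
    """All (A, C, G, T) compositions of the given length within ppm of measured."""
    tol = measured * ppm * 1e-6
    hits = []
    for a in range(length + 1):
        for c in range(length + 1 - a):
            for g in range(length + 1 - a - c):
                comp = (a, c, g, length - a - c - g)
                if abs(strand_mass(comp) - measured) <= tol:
                    hits.append(comp)
    return hits


def paired_candidates(mass_fwd, mass_rev, length, ppm):
    """Forward compositions whose complement also matches the reverse strand."""
    rev = set(candidates(mass_rev, length, ppm))
    # The complement of (A, C, G, T) has counts (T, G, C, A).
    return [(a, c, g, t) for (a, c, g, t) in candidates(mass_fwd, length, ppm)
            if (t, g, c, a) in rev]


true_fwd = (12, 10, 14, 9)             # hypothetical 45-nt forward strand
true_rev = (9, 14, 10, 12)             # its complement
m_fwd, m_rev = strand_mass(true_fwd), strand_mass(true_rev)

single = candidates(m_fwd, 45, 100.0)                  # forward strand alone
paired = paired_candidates(m_fwd, m_rev, 45, 100.0)    # both strands
```

With these assumed masses, the composition (17, 13, 14, 1) differs from the true forward strand by only about 0.01 Da and therefore survives the single-strand search, but its complement differs from the true reverse strand by about 2.95 Da and is rejected once both strands are considered. Some alternatives shift both strands equally and are only removed by a tighter tolerance.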

This example illustrates an alternative approach to derive base compositions from larger
PCR products. Because the amplicons of interest cover many strain variants, for some of which
complete sequences are not known, each amplicon can be digested under several different
enzymatic conditions to ensure that a diagnostically informative region of the amplicon is not
obscured by a "blind spot" which arises from a mutation in a restriction site. The extent of
redundancy required to confidently map the base composition of amplicons from different
markers can be determined, as can the set of restriction enzymes that should be employed and
how they are most effectively used as mixtures. These parameters will be dictated by the
extent to which the area of interest is conserved across the amplified region, the compatibility of
the various restriction enzymes with respect to digestion protocol (buffer, temperature, time) and
the degree of coverage required to discriminate one amplicon from another.

Example 21: Identification of members of the Viral Genus Orthopoxvirus

Primer sites were identified on three essential viral genes - the DNA-dependent
polymerase (DdDp), and two sub-units of DNA-dependent RNA polymerases A and B (DdRpA
and DdRpB). These intelligent primers designed to identify members of the viral genus
Orthopoxvirus are shown in Table 12 wherein Tp = 5-propynylated uridine and Cp =
5-propynylated cytidine.

Table 12
Intelligent Primer Pairs for Identification of members of the Viral Genus Orthopoxvirus

Primer Pair Name         | Forward Primer Sequence      | Forward SEQ ID NO: | Reverse Primer Sequence       | Reverse SEQ ID NO:
A25L NC001611 28 127     | GTACTGAATCCGCCTAAG           | 354                | GTGAATAAAGTATCGCCCTAATA       | 355
A18R NC001611 100 207    | GAAGTTGAACCGGGATCA           | 356                | ATTATCGGTCGTTGTTAATGT         | 357
A18R NC001611 1348 1445  | CTGTCTGTAGATAAACTAGGATT      | 358                | CGTTCTTCTCTGGAGGAT            | 359
E9L NC001611 1119 1222   | CGATACTACGGACGC              | 360                | CTTTATGAATTACTTTACATAT        | 361
K8R NC001611 221 311     | CTCCTCCATCACTAGGAA           | 362                | CTATAACATTCAAAGCTTATTG        | 363
A24R NC001611 795 878    | CGCGATAATAGATAGTGCTAAAC      | 364                | GCTTCCACCAGGTCATTAA           | 365
A25L NC001611 28 127P    | GTACpTpGAATpCpCpGCpCpTAAG    | 366                | GTGAATAAAGTATpCpGCpCpCpTpAATA | 367
A18R NC001611 100 207P   | GAAGTpTpGAACpCpGGGATCA       | 368                | ATTATCGGTpCpGTpTpGTpTpAATGT   | 369
A18R NC001611 1348 1445P | CTGTpCpTpGTAGATAAACpTpAGGATT | 370                | CGTTCpTpTpCpTpCpTpGGAGGAT     | 371
E9L NC001611 1119 1222P  | CGATACpTpACpGGACGC           | 372                | CTTTATGAATpTpACpTpTpTpACATAT  | 373
K8R NC001611 221 311P    | CTpCpCpTCpCpATCACpTpAGGAA    | 374                | CTATAACATpTpCpAAAGCpTpTpATTG  | 375
A24R NC001611 795 878P   | CGCGATpAATpAGATAGTpGCpTpAAAC | 376                | GCTTCpCpACpCAGGTpCATpTAA      | 377



As illustrated in Figure 27, members of the Orthopoxvirus genus group can be
identified, distinguished from one another, and distinguished from other members of the
Poxvirus family using a single pair of primers designed against the DdRpB gene.

Since the primers were designed across regions of high conservation within this genus,
the likelihood of missed detection due to sequence variations at these sites is minimized. Further,
none of the primers is expected to amplify other viruses or any other DNA, based on the data
available in GenBank. This method can be used for all families of viral threat agents and is not
limited to members of the Orthopoxvirus genus.
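
The Tp and Cp notation used in Table 12 can be related to the unmodified primer sequences with a short sketch. The helper names below are hypothetical, and the sketch assumes only what the table states: a lowercase "p" following a T or C marks that nucleobase as propynylated. It recovers the unmodified sequence and lists which positions carry the modification.

```python
def strip_propynyl(primer):
    """Return the primer sequence with the propynyl markers removed."""
    return primer.replace("p", "")


def propynyl_positions(primer):
    """Return 1-based positions of nucleobases that are followed by 'p'."""
    positions = []
    index = 0  # number of nucleobases seen so far
    for ch in primer:
        if ch == "p":
            positions.append(index)  # the marker refers to the preceding base
        else:
            index += 1
    return positions


# Forward primer of the propynylated A25L pair (SEQ ID NO: 366 in Table 12).
modified = "GTACpTpGAATpCpCpGCpCpTAAG"
plain = strip_propynyl(modified)
sites = propynyl_positions(modified)
```

Applied to the forward primer of SEQ ID NO: 366, the stripped sequence is the 18-mer GTACTGAATCCGCCTAAG, which matches the unmodified forward primer of the corresponding A25L pair.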

Example 22: Identification of Viruses that Cause Viral Hemorrhagic Fevers

In accordance with the present invention, an approach of broad PCR priming across
several different viral species is employed using conserved regions in the various viral genomes,
amplifying a small, yet highly informative region in these organisms, and then analyzing the
resultant amplicons with mass spectrometry and data analysis. These regions will be tested with
live agents, or with genomic constructs thereof.

Detection of RNA viruses will necessitate a reverse transcription (RT) step prior to the
PCR amplification of the TIGER reporter amplicon. To maximize throughput and yield while
minimizing the handling of the samples, commercial one-step reverse transcription polymerase
chain reaction (RT-PCR) kits will be evaluated for use. If necessary, a one-step RT-PCR mix
using our selected DNA polymerase for the PCR portion of the reaction will be developed. To
assure there is no variation in our reagent performance, all new lots of enzymes, nucleotides and
buffers will be individually tested prior to use.

Various modifications of the invention, in addition to those described herein, will be
apparent to those skilled in the art from the foregoing description. Such modifications are also
intended to fall within the scope of the appended claims. Each reference, web site, GenBank
accession number, etc. cited in the present application is incorporated herein by reference in its
entirety. The following US applications are incorporated herein by reference in their entirety:
Serial No. 10/323,233 filed December 18, 2002, Serial No. 10/326,051 filed December 18, 2002,
Serial No. 10/325,527 filed December 18, 2002, Serial No. 10/325,526 filed December 18, 2002,
Serial No. 60/431,319 filed December 6, 2002, Serial No. 60/443,443 filed January 29, 2003,
Serial No. 60/443,788 filed January 30, 2003, Serial No. 60/447,529 filed February 14, 2003,
and Serial No. 60/501,926 filed September 11, 2003.

What is claimed is:

1. A method of identifying a plurality of etiologic agents of disease in an individual
comprising the steps of:

amplifying at least one nucleic acid molecule obtained from a biological sample from
the individual with a plurality of intelligent primers to obtain a plurality of amplification
products corresponding to the plurality of etiologic agents; and

determining the molecular masses of the plurality of amplification products, wherein the
molecular masses identify the plurality of etiologic agents and wherein the intelligent primers are
broad range survey primers, division-wide primers, drill-down primers, or any combination
thereof.

2. A method of claim 1 wherein identification of at least one of the plurality of etiologic
agents is accomplished at the genus or species level, and the intelligent primers are broad range
survey primers, division-wide primers, or any combination thereof.

3. A method of claim 1 wherein a subspecies characteristic of at least one of the plurality
of etiologic agents is obtained using drill-down primers.

4. A method of claim 3 wherein the subspecies characteristic is serotype, strain type, sub- 
strain type, sub-species type, emm-type, presence of a bioengineered gene, presence of a toxin 
gene, presence of an antibiotic resistance gene, presence of a pathogenicity island, or presence of 
a virulence factor. 

5. A method of claim 1 wherein the molecular mass is determined by mass spectrometry.

6. A method of claim 5 wherein the mass spectrometry is Fourier transform ion cyclotron
resonance mass spectrometry (FTICR-MS), ion trap, quadrupole, magnetic sector, time of flight
(TOF), Q-TOF, or triple quadrupole.

7. A method of claim 1 wherein the molecular masses are used to determine the base
compositions of the amplification products and wherein the base compositions identify the
pathogen.

8. A method of in silico screening of intelligent primer sets for identification of a plurality
of bioagents comprising the steps of:

preparing a base composition probability cloud plot from a plurality of base
composition signatures of the plurality of bioagents generated in silico;

inspecting the base composition probability cloud plot for overlap of clouds from 
different bioagents; and 

selecting primer sets based on minimal overlap of the clouds. 

9. A method of performing epidemic surveillance comprising the steps of: 
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determining the molecular mass of the amplification product, wherein said molecular 
mass identifies the pathogen in the biological sample. 

28. A method of claim 27 wherein the pathogen is a bacterium, a virus, a protozoan, a
parasite, a mold, or a fungus.

29. A method of claim 27 wherein the biological sample is blood, mucus, hair, urine,
breath, sputum, saliva, stool, nail, or tissue biopsy.

30. A method of claim 27 wherein the biological sample is obtained from an animal. 

31. A method of claim 30 wherein the animal is a human. 

32. A method of claim 27 wherein the intelligent primers are broad range survey primers,
division-wide primers, or drill-down primers.

33. A method of claim 32 wherein identification of the pathogen is accomplished at the 
genus or species level, and wherein the intelligent primers are broad range survey primers or 
division-wide primers. 

34. A method of claim 32 wherein a subspecies characteristic of the pathogen is obtained
using drill-down primers.

35. A method of claim 34 wherein the subspecies characteristic is serotype, strain type, 
sub-strain type, sub-species type, emm-type, presence of a bioengineered gene, presence of a 
toxin gene, presence of an antibiotic resistance gene, presence of a pathogenicity island, or 
presence of a virulence factor. 

36. A method of claim 27 wherein the molecular mass is determined by mass spectrometry.

37. A method of claim 36 wherein the mass spectrometry is Fourier transform ion cyclotron
resonance mass spectrometry (FTICR-MS), ion trap, quadrupole, magnetic sector, time of flight
(TOF), Q-TOF, or triple quadrupole.

38. A method of claim 27 wherein the intelligent primers are targeted to ribosomal RNA or
housekeeping genes.

39. A method of claim 27 wherein the molecular mass is used to determine the base 
composition of said amplification product and wherein said base composition identifies said 
pathogen. 

40. An intelligent primer pair wherein each member of the pair has at least 70% sequence
identity with the sequence of the corresponding member of any one of the following intelligent
primer pair sequences: SEQ ID NOs: 8:9, 10:11, 12:13, 14:15, 16:17, 18:19, 20:21, 22:23, 24:25,
26:27, 28:29, 30:31, 32:33, 34:35, 36:37, 38:39, 40:41, 42:43, 44:45, 46:47, 48:49, 50:51, 52:53,
54:55, 56:57, 58:59, 60:61, 62:63, 64:65, 66:67, 68:69, 70:71, 72:73, 74:75, 76:77, 78:79, 80:81, 
82:83, 84:85, 86:87, 88:89, 90:91, 92:93, 94:95, 96:97, 98:99, 100:101, 102:103, 104:105,
106:107, 108:109, 110:111, 112:113, 114:115, 116:117, 118:119, 120:121, 122:123, 124:125,
126:127, 128:129, 130:131, 132:133, 134:135, 136:137, 138:139, 140:141, 142:143, 144:145, 
146:147, 148:149, 150:151, 152:153, 154:155, 156:157, 158:159, 160:161, 162:163, 164:165, 
166:167, 168:169, 170:171, 172:173, 174:175, 176:177, 178:179, 180:181, 182:183, 184:185, 

186:187, 188:189, 190:191, 192:193, 194:195, 196:197, 198:199, 200:201, 202:203, 204:205,
206:207, 208:209, 210:211, 212:213, 214:215, 216:217, 218:219, 220:221, 222:223, 224:225,
226:227, 228:229, 230:231, 232:233, 234:235, 236:237, 238:239, 240:241, 242:243, 244:245, 
246:247, 248:249, 250:251, 252:253, 254:255, 256:257, 258:259, 260:261, 262:263, 264:265, 
266:267, 268:269, 270:271, 272:273, 274:275, 276:277, 278:279, 280:281, 282:283, 284:285,
286:287, 288:289, 290:291, 292:293, 294:295, 296:297, 298:299, 300:301, 302:303, 304:305,
306:307, 308:309, 310:311, 312:313, 314:315, 316:317, 318:319, 320:321, 322:323, 324:325, 
326:327, 328:329, 330:331, 332:333, 334:335, 336:337, 338:339, 340:341, 342:343, 344:345, 
346:347, 348:349, 350:351, 352:353, 354:355, 356:357, 358:359, 360:361, 362:363, 364:365, 
366:367, 368:369, 370:371, 372:373, 374:375, or 376:377.

41. The intelligent primer pair of claim 40 comprising at least one modified nucleobase.

42. The intelligent primer pair of claim 41 wherein the modified nucleobase is
5-propynylcytidine or 5-propynyluridine.

43. A bioagent identifying amplicon comprising an isolated polynucleotide of about 45 to
about 150 nucleobases in length produced by the process of amplification of nucleic acid from a
bioagent with a pair of intelligent primers wherein each intelligent primer is of a length of about
12 to about 35 nucleobases, wherein the bioagent identifying amplicon provides identifying
information about the bioagent.

44. The bioagent identifying amplicon of claim 43 wherein each member of the pair has at
least 70% sequence identity with the sequence of the corresponding member of any one of the
following intelligent primer pair sequences: SEQ ID NOs: 8:9, 10:11, 12:13, 14:15, 16:17, 18:19,
20:21, 22:23, 24:25, 26:27, 28:29, 30:31, 32:33, 34:35, 36:37, 38:39, 40:41, 42:43, 44:45, 46:47, 
48:49, 50:51, 52:53, 54:55, 56:57, 58:59, 60:61, 62:63, 64:65, 66:67, 68:69, 70:71, 72:73, 74:75, 
76:77, 78:79, 80:81, 82:83, 84:85, 86:87, 88:89, 90:91, 92:93, 94:95, 96:97, 98:99, 100:101, 
102:103, 104:105, 106:107, 108:109, 110:111, 112:113, 114:115, 116:117, 118:119, 120:121, 

122:123, 124:125, 126:127, 128:129, 130:131, 132:133, 134:135, 136:137, 138:139, 140:141,
142:143, 144:145, 146:147, 148:149, 150:151, 152:153, 154:155, 156:157, 158:159, 160:161,
162:163, 164:165, 166:167, 168:169, 170:171, 172:173, 174:175, 176:177, 178:179, 180:181, 
182:183, 184:185, 186:187, 188:189, 190:191, 192:193, 194:195, 196:197, 198:199, 200:201, 
202:203, 204:205, 206:207, 208:209, 210:211, 212:213, 214:215, 216:217, 218:219, 220:221,
222:223, 224:225, 226:227, 228:229, 230:231, 232:233, 234:235, 236:237, 238:239, 240:241,
242:243, 244:245, 246:247, 248:249, 250:251, 252:253, 254:255, 256:257, 258:259, 260:261,
262:263, 264:265, 266:267, 268:269, 270:271, 272:273, 274:275, 276:277, 278:279, 280:281,
282:283, 284:285, 286:287, 288:289, 290:291, 292:293, 294:295, 296:297, 298:299, 300:301,
302:303, 304:305, 306:307, 308:309, 310:311, 312:313, 314:315, 316:317, 318:319, 320:321,
322:323, 324:325, 326:327, 328:329, 330:331, 332:333, 334:335, 336:337, 338:339, 340:341, 
342:343, 344:345, 346:347, 348:349, 350:351, 352:353, 354:355, 356:357, 358:359, 360:361, 
362:363, 364:365, 366:367, 368:369, 370:371, 372:373, 374:375, or 376:377. 

45. A bioagent identifying amplicon for identification of a bacterium comprising an isolated
polynucleotide of about 45 to about 150 nucleobases in length produced by the process of
amplification of nucleic acid encoding ribosomal RNA from a bacterium with a pair of 
intelligent primers wherein each intelligent primer is of a length of about 12 to about 35 
nucleobases, wherein the bioagent identifying amplicon provides identifying information about 
the bioagent. 

46. The bioagent identifying amplicon of claim 45 wherein each member of the pair has at
least 70% sequence identity with the sequence of the corresponding member of any one of the
following intelligent primer pair sequences: SEQ ID NOs: 8:9, 10:11, 12:13, 14:15, 16:17, 18:19,
20:21, 22:23, 24:25, 26:27, 28:29, 30:31, 32:33, 34:35, 36:37, 38:39, 40:41, 42:43, 44:45, 46:47, 
48:49, 50:51, 52:53, 54:55, 56:57, 58:59, 60:61, 62:63, 64:65, 66:67, 68:69, 70:71, 72:73, 74:75, 

76:77, 78:79, 80:81, 82:83, 84:85, 86:87, 88:89, 90:91, 92:93, 94:95, 96:97, 98:99, 100:101,
102:103, 104:105, 106:107, 108:109, 110:111, 112:113, 114:115, 116:117, 118:119, 120:121, 
122:123, 124:125, 126:127, 128:129, 130:131, 132:133, 134:135, 136:137, 138:139, 140:141, 
142:143, 144:145, 146:147, 148:149, 150:151, 152:153, 154:155, 156:157, 158:159, 160:161, 
162:163, 164:165, 166:167, 168:169, 170:171, 172:173, 174:175, 176:177, 178:179, 180:181, 

182:183, 184:185, 186:187, 188:189, 190:191, 192:193, 194:195, 196:197, 198:199, 200:201,
202:203, 204:205, 206:207, 208:209, 210:211, 212:213, 214:215, 216:217, 218:219, 220:221, 
222:223, 224:225, 226:227, 228:229, 230:231, 232:233, 234:235, 236:237, 238:239, 240:241, 
242:243, 244:245, 246:247, 248:249, 250:251, 252:253, 254:255, 256:257, 258:259, 260:261, 
262:263, 264:265, 266:267, 268:269, 270:271, 272:273, 274:275, 276:277, 278:279, 280:281, 

282:283, 284:285, 334:335, 336:337, 338:339, 340:341, 342:343, 344:345, 346:347, 348:349,
350:351, or 352:353.

47. A bioagent identifying amplicon for identification of a virus comprising an isolated 
polynucleotide of about 45 to about 150 nucleobases in length produced by the process of 
amplification of nucleic acid encoding a viral housekeeping gene with a pair of intelligent
primers wherein each intelligent primer is of a length of about 12 to about 35 nucleobases,
wherein the bioagent identifying amplicon provides identifying information about the bioagent.

48. The bioagent identifying amplicon of claim 47 wherein each member of the pair has at
least 70% sequence identity with the sequence of the corresponding member of any one of the
following intelligent primer pair sequences: SEQ ID NOs: 286:287, 288:289, 290:291, 292:293,
294:295, 296:297, 298:299, 300:301, 302:303, 304:305, 306:307, 308:309, 310:311, 312:313, 
314:315, 316:317, 318:319, 320:321, 322:323, 324:325, 326:327, 328:329, 330:331, 332:333, 
354:355, 356:357, 358:359, 360:361, 362:363, 364:365, 366:367, 368:369, 370:371, 372:373, 
374:375, or 376:377. 

49. The method of claim 48 wherein said viral housekeeping gene is hexon,
DNA-dependent polymerase, DNA-dependent RNA polymerase A, or DNA-dependent RNA
polymerase B.

[Drawing sheets: FIG. 1A-1, FIG. 1A-4, FIG. 1C, FIG. 14, FIG. 16, FIG. 17, Figure 18, Figure 19, Figure 22]

SEQUENCE LISTING

<110> ISIS Pharmaceuticals, Inc. 
Ecker, David J. 
Griffey, Richard H. 
Sampath, Rangarajan 
Hofstadler, Steven 
McNeil, John 
Crooke, Stanley T. 
Blyn, Lawrence B. 
Ranken, Raymond 
Hall, Thomas A. 

<120> METHODS FOR RAPID IDENTIFICATION OF PATHOGENS IN HUMANS AND ANIMALS 

<130> IBIS0060-500WO (DIBIS-0044WO) 

<150> 60/431,319 
<151> 2002-12-06 

<150> 10/323,233 
<151> 2002-12-18 

<150> 10/326,051 
<151> 2002-12-18 

<150> 10/325,527 
<151> 2002-12-18 

<150> 10/325,526 
<151> 2002-12-18 

<150> 60/443,443 
<151> 2003-01-29 

<150> 60/443,788 
<151> 2003-01-30 

<150> 60/447,529 
<151> 2003-02-14 

<150> 60/501,926 
<151> 2003-09-11 
<160> 377 

<170> FastSEQ for Windows Version 4.0 

<210> 1 
<211> 1388 
<212> RNA 

<213> 16S rRNA Consensus Sequence 

<220> 

<221> misc_feature 

<222> 1-7, 15, 22-24, 36-38, 41, 42,44, 56, 59-90, 93, 97, 98, 109, 
110, 112-116, 118-120, 123-131, 134,136, 138-144, 149-155, 161, 
162, 164-177, 182-209, 212-220, 222-225, 227, 230, 231, 235, 236, 
241, 245, 246, 253, 255-257, 260, 261-263, 267, 269, 270, 278, 279,
281, 282, 284, 291, 294, 301, 306, 310, 329, 330, 335, 336-338,
344, 345, 347, 350, 351, 355, 356, 357, 361,363, 364, 371, 372, 
373, 376, 379, 382-386, 388, 389, 394-396, 398-405, 408, 411-438, 
442, 443, 445-451, 453, 454, 458-460, 465, 469, 491, 495, 496, 499, 
504-506, 511, 512, 514, 524, 526-528, 530, 534, 537-544, 546-550, 
552, 556-562, 565, 569-578, 580-586, 589-595, 597, 601-606, 609, 
612-617, 621-624, 629, 633, 636, 639, 643, 645, 646, 648, 650, 654, 
658-660, 669-674, 678-683, 689, 691, 693-696, 704, 708, 713, 734, 
737, 738, 744, 746-754, 756-758, 760-782, 784-786, 791-793, 
796-800, 815, 816, 823-825, 834, 845, 848, 857, 859, 864, 869, 875, 
877, 878, 884, 886, 894-898, 903-917, 921, 922-948, 955, 961, 972, 
973, 978, 990, 1005-1013, 1015, 1017, 1019, 1021-1029, 1031, 1033, 
1037-1043, 1049-1051, 1053, 1054, 1057-1059, 1069, 1075, 1083, 1085, 
1089, 1094, 1096-1099, 1104, 1110, 1111, 1119-1123, 1127, 1128, 
1130, 1132, 1133, 1136, 1138, 1139-1141, 1143, 1144, 1146-1150, 
1154, 1157, 1159-1162, 1166-1170, 1172, 1173, 1176, 1181, 1183-1186, 
1195-1198, 1200, 1205, 1206, 1210, 1220-1222, 1227, 1229, 1231-1233, 
1244, 1249, 1266-1268, 1271, 1273, 1274, 1277-1285, 1288, 1289, 
1293-1304, 1306-1311, 1313-1322, 1324, 1326, 1329, 1330-1338, 1341, 
1345-1347, 1361, 1364, 1366-1368, 1372, 1373, 1375, 1378, 1388 



<223> n = A,T,C or G 
<400> 1 

nnnnnnnaga ggacnggcca gnnngaacgc ggcggnnngc nnanacagca agcgancgnn 60
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn agnggcnnac ggggagaann cnnnnnannn 120
ccnnnnnnnn nggnanannn nnnngaaann nnnnnaaacc nnannnnnnn nnnnnnnaaa 180
gnnnnnnnnn nnnnnnnnnn nnnnnnnnng annnnnnnnn gnnnnanagn ngggnnggaa 240
nggcnnacca agncnnngan nnnagcngnn cgagaggnng nncngccaca nggnacgaga 300
nacggnccan acccacggga ggcagcagnn ggaannnnca aggnngnaan ncgannnagc 360
nannccgcgg nnngangang gnnnnngnng aaannncnnn nnnnnganga nnnnnnnnnn 420
nnnnnnnnnn nnnnnnnnga cnnannnnnn nannaagnnn cggcnaacnc ggccagcagc 480
cgcggaaacg naggnngcna gcgnnncgga nnangggcga aagngnnngn aggnggnnnn 540
nnnngnnnnn gnaaannnnn nngcnaacnn nnnnnnnncn nnnnnnacnn nnnnncngag 600
nnnnnnagng gnnnnnngaa nnnnggagng ggnaancgna gananngnan gaanaccnnn 660
gcgaaggcnn nnnncggnnn nnnacgacnc nannnncgaa agcngggnag cnaacaggaa 720
gaacccggag ccangcnnaa acgngnnnnn nnnngnnngn nnnnnnnnnn nnnnnnnnnn 780
nnannnaacg nnnaannnnn ccgccgggga gacgnncgca agnnnaaacc aaangaagac 840
ggggnccngc acaagcngng gagnagggna acgangnnac gcgnanaacc accnnnnnga 900
cannnnnnnn nnnnnnngan annnnnnnnn nnnnnnnnnn nnnnnnnnac agggngcagg 960
ngcgcagccg gnnggagngg ggaagcccgn aacgagcgca acccnnnnnn nnngncnanc 1020
nnnnnnnnng ngnaccnnnn nnnacgocnn ngnnaannng gaggaaggng gggangacgc 1080
aancncagnc ccangnnnng ggcncacacn ncacaaggnn nnnacanngn gnngcnannn 1140
ngnnannnnn agcnaancnn nnaaannnnn cnnagncgga ngnnnncgca accgnnnncn 1200
gaagnnggan cgcagaacgn nnacagnang nnncgggaaa cgcncgggnc gacacaccgc 1260
ccgcannnca ngnnagnnnn nnnnnccnna agnnnnnnnn nnnncnnnnn ngnnnnnnnn 1320
nncnanggnn nnnnnnnnga ngggnnnaag cgaacaagga nccnannnga anngnggngg 1380
acaccccn 1388

<210> 2
<211> 2654
<212> RNA
<213> Artificial Sequence
<220>
<221> misc_feature

<222> 1-4, 8-12, 16, 18-22, 31, 35-40, 43, 47, 54, 56-62, 64, 65, 67-69, 
71, 72, 74-76, 79, 80, 83, 84, 86-93, 95-99, 101, 104-106, 108, 109, 
119, 125-142, 144-171, 173-175, 178-182, 186, 194, 198-202, 208, 214, 
215, 218-221, 224, 226-229, 233, 244-246, 248, 250, 251, 254, 256-280, 
282-284, 288-292, 295, 299-307,309-311, 314, 316, 318, 319, 322, 325-328, 
332-354, 358-360, 362-365, 367, 370, 372, 373, 375-378, 380, 381, 
385-387, 389-392, 398-403, 407, 414-420, 425, 429, 431, 433, 434, 439, 
443, 451, 458, 463, 464, 465, 466, 467, 469, 479, 482, 483, 492-496, 498, 
500, 503-505, 507-509, 512-525, 528, 529, 533, 537, 539, 540, 543-545, 
547, 549, 552-561, 564, 567, 570-578, 580, 583, 586, 589, 594, 599-601, 
604, 605, 607, 611-613, 616-620, 622-625, 630, 635, 637-639, 643, 646-648, 
651, 652, 657, 662-666, 670, 672-676, 682, 689-696, 703-708, 714, 715, 
718-720, 722, 725, 730, 731, 733, 736, 738, 742-744, 746, 747, 756, 757, 
763-766, 770-773, 776-791, 794,805-814, 817-829, 832, 833, 835-842, 847, 
852, 855-870, 872, 875, 876, 878, 879, 881-883, 885, 887, 889, 892-894, 
896-898, 900, 901, 903, 908, 913, 920, 922, 923, 925-927, 932, 936, 
939-946, 952, 956, 959, 962-967, 969, 970, 972, 976-978, 983, 999, 1001, 
1002, 1008, 1009, 1015, 1022, 1023, 1025, 1028-1034, 1039, 1042, 1043, 
1045, 1047, 1052, 1056-1063, 1069-1074, 1076-1097, 1102, 1103, 1109-1121, 
1126-1132, 1134, 1135, 1137-1143, 1147-1155, 1159, 1161, 1164, 1165, 1167, 
1168, 1170, 1174, 1178-1185, 1189, 1191, 1192, 1194-1198, 1200, 1204, 
1206-1208, 1210, 1215, 1218-1223, 1225, 1227-1229, 1231-1236, 1240, 
1245-1247, 1253, 1254, 1258, 1260, 1263, 1265, 1267, 1268, 1271, 1272,
1277, 1278, 1280-1282, 1285, 1286, 1291-1293, 1296-1316, 1321-1326, 
1328-1345, 1348-1455, 1457, 1458, 1464-1490, 1496, 1497, 1511, 1513-1516, 
1518, 1519, 1523, 1525, 1526, 1528, 1529, 1533, 1535-1537, 1539-1542, 
1545-1552, 1560, 1561, 1567-1571, 1576, 1581, 1583, 1588-1591, 1593-1633, 
1635-1638, 1640-1642, 1644-1646, 1648-1654, 1656, 1661, 1662, 1673, 1674, 
1676, 1677, 1680, 1683, 1684, 1687, 1691, 1692, 1695, 1699, 1702, 1703, 
1707, 1714, 1718, 1719, 1727, 1728, 1730-1738, 1740-1744, 1746-1756, 
1758-1760, 1767, 1768, 1770, 1779, 1780, 1789, 1790, 1820, 1828, 1831, 
1833, 1836, 1839-1846, 1851-1859, 1861, 1863, 1865, 1869-1871, 1873-1877, 
1879, 1886-1890, 1892, 1896-1900, 1915, 1916, 1918, 1920, 1924, 1925, 
1927-1932, 1934, 1936-1950, 1952, 1953, 1956, 1961, 1966, 1968, 1969, 
1970, 1973-1980, 1983, 1984, 1987-1993, 1998, 2000-2004, 2006, 2007, 2011, 
2014, 2016-2029, 2034-2044, 2046, 2048-2056, 2061, 2063-2065, 2067, 2068, 
2072, 2075, 2085, 2086, 2091, 2095, 2096, 2106, 2108, 2109, 2111, 
2116-2118, 2120, 2122-2125, 2128, 2129, 2132, 2133, 2136-2143, 2146, 2147, 
2150, 2151, 2153, 2155, 2159, 2160, 2161, 2164, 2165, 2169, 2170,
2173-2176, 2179-2182, 2190-2192, 2199, 2200, 2203-2205, 2214, 2217-2222, 
2228, 2232, 2248, 2251, 2253, 2266, 2268-2271, 2280, 2283, 2291-2294, 2311, 
2313, 2324, 2327, 2328, 2339, 2340, 2349, 2350, 2355, 2358, 2362, 2370, 
2372, 2386, 2394, 2396, 2397, 2399, 2401, 2403, 2405-2407, 2410-2412, 
2415-2417, 2419-2421, 2423, 2424, 2426, 2442-2444, 2446, 2449, 2450, 2452, 
2454, 2459-2461, 2463, 2466-2468, 2473-2475, 2479, 2480, 2483, 2485, 2486, 
2491-2494, 2497-2500, 2505, 2506, 2512, 2520-2522, 2526, 2528-2530, 
2532-2535, 2539, 2540, 2543-2545, 2547, 2549-2568, 2571-2573, 2575, 
2576-2579, 2583, 2584, 2586-2589, 2591, 2597-2599, 2601-2603, 2605, 
2609-2612, 2614, 2615, 2617, 2618-2622, 2625-2627, 2630-2632, 2637-2640, 
2642-2647, 2649-2654 



<223> n = A, T,C or G 
<400> 2 

annnaagnnn nnaagngnnn nnggggagcc nggcnnnnnn agncgangaa ggangnnnnn 60 
nncnncnnna nncnnnggnn agnngnnnnn nnncnnnnna nccnnngnnc cgaaggggna 120
acccnnnnnn nnnnnnnnnn nnannnnnnn nnnnnnnnnn nnnnnnnnnn ngnnnacnnn 180
nngaangaaa cacnagannn nnaggaanag aaannaannn ngancnnnng agnggcgagc 240 
gaannngnan nagncnnnnn nnnnnnnnnn nnnnnnnnnn annngaannn nnggnaagnn 300 
nnnnnnnann nggnanannc cngannnnaa annnnnnnnn nnnnnnnnnn nnnnagannn 360 
cnnnncncgn gnnannnngn ngaannngnn nngaccannn nnnaagncaa aacnnnnnnn 420 
gaccnaagng nannagacng ganggaaagg ngaaaagnac ccnnnnnang ggaggaaana 480 
gnnccgaaac cnnnnncnan aannngnnna gnnnnnnnnn nnnnnganng cgnccgnann 540 
agnnncngng annnnnnnnn ngcnagnaan nnnnnnnngn agncgnagng aaancgagnn 600 
naanngngcg nnnagnnnnn gnnnnagacn cgaancnnng gancannnag nncaggngaa 660 
gnnnnngaan annnnnggag gnccgaacnn nnnnnnggaa aannnnnngg agannggnnn 720 
gnggngaaan ncnaancnaa cnnngnnaag cggccnncga aannnnaggn nnngcnnnnn 780 
nnnnnnnnnn nggnggagag cacgnnnnnn nnnnggnnnn nnnnnnnnna cnnannnnnn 840 
nnaaacncga anccnnnnnn nnnnnnnnnn gnagnnannc nnngngngna annncnnngn 900 
nanagggnaa cancccagan cnncnnnaag gncccnaann nnnnnnaagg gnaaangang 960 
gnnnnnncnn anacannnag gangggcaga agcagccanc nnaaaganng cgaanagcca 1020
cnncnagnnn nnnngcgcng annanancgg gncaannnnn nnnccgaann nnnngnnnnn 1080 
nnnnnnnnnn nnnnnnngga gnngagcgnn nnnnnnnnnn ngaagnnnnn nngnnannnn 1140 
nnnggannnn nnnnnaggng nagnngnnan agancgannn nnnnnggana nncnnnnncn 1200 
ccgnannncn aaggnccnnn nnnangnnnc nnnnnngggn agcgnnncca agnngagncn 1260 
ganangnnag nngaggnnan nnggnnaacc nnnacnnnnn nnnnnnnnnn nnnnnngacg 1320 
nnnnnngnnn nnnnnnnnnn nnnnnggnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1380 
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1440 
nnnnnnnnnn nnnnncnnga aaannnnnnn nnnnnnnnnn nnnnnnnnnn cgaccnnaaa 1500 
ccgacacagg ngnnnngnng agnanncnna ggngnnngnn nnaannnnnn nnaaggaacn 1560 
ngcaaannnn nccganccgg nanaaggnnn ncnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1620 
nnnnnnnnnn nnngnnnnan nnannngnnn nnnncnacga nnaaaaacac agnncnngcn 1680 
aanncgnaag nngangaang gnngacnccg cccnggcnng aaggaanngn nnnnnnnngn 1740 
nnnngnnnnn nnnnnnannn aagcccnngn aacggcggnn gaacaaacnn ccaaggagcg 1800 
aaaccgcggg aagccgaccn gcacgaangg ngnaangann nnnnnncgcc nnnnnnnnnc 1860 
ncngngaann nannnnngna agagcnnnnn cncgcnnnnn gacggaaaga ccccnngnan 1920 
cacnnannnn nnangnnnnn nnnnnnnnnn gnnagnaagg nggagncnnn gannnnnnnn 1980 
cgnnagnnnn nnnggagncn nnnngnnaac nacncnnnnn nnnnnnnnnc aacnnnnnnn 2040 
nnnnancnnn nnnnnngaca ngnnngnngg gnagnacggg gcggnncccc naaanngaac 2100 
ggaggngnnc naaggnnncn annnnggnng gnnacnnnnn nnnagnnaan ngnanaagnn 2160 
ngcnnacgnn agnnnnacnn nncgagcagn nncgaaagnn ggnnnaggac cggnggnnnn 2220 
nnggaagngc cncgccaacg gaaaaagnac ncnggggaaa caggcnannn ncccaagagn 2280 
canacgacgg nnnngggcac ccgagcggcc ncncaccggg gcgnagnngg cccaagggnn 2340 
ggcgcgccnn aaagnggnac gngagcgggn anaacgcgga gacagnggcc cacngnngng 2400 
ngngnnngan nngannngnn ngnncnagac gagaggaccg gnnngnacnn ancncgggnn 2460 
ncnggnnngc cannngcann gcngnnagca nnnnggnnnn gaaanngcga angcacaagn 2520 
nngaancnnn cnnnnagann agnnncncnn nnnnnnnnnn nnnnnnnnag nnncnnnnna 2580
gannannnng ngaaggnnng nnngnaagnn nngnnannnn nnagnnnacn nnacaannnn 2640 
cnnnnnncnn nnnn 2654 

<210> 3 
<211> 13 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 3 

cgtggtgacc ctt 13 

<210> 4 
<211> 14 






WO 2004/060278 



PCT/US2003/038761 



<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 4 

cgtcgtcacc gcta 14 

<210> 5 
<211> 13 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 5 

cgtggtaccc ctt 13 

<210> 6 
<211> 90 
<212> RNA 

<213> Bacillus anthracis 
<220> 

<221> misc_feature 
<222> 20 

<223> n = A, U, C or G 
<400> 6 

gcgaagaacc uuaccaggun uugacauccu cugacaaccc uagagauagg gcuucuccuu 60 
cgggagcaga gugacaggug gugcaugguu 90 

<210> 7 
<211> 90 
<212> RNA 

<213> Bacillus cereus 
<400> 7 

gcgaagaacc uuaccagguc uugacauccu cugaaaaccc uagagauagg gcuucuccuu 60 
cgggagcaga gugacaggug gugcaugguu 90 

<210> 8 
<211> 30 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> PCR Primer 
<400> 8 

gtgagatgtt gggttaagtc ccgtaacgag 30

<210> 9
<211> 21
<212> DNA

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 9 

gacgtcatcc ccaccttcct c 

<210> 10 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 10

atgttgggtt aagtcccgca acgag 

<210> 11 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 11 

ttgacgtcat ccccaccttc ctc

<210> 12 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 12 

ttaagtcccg caacgatcgc aa 

<210> 13 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 13 

tgacgtcatc cccaccttcc tc

<210> 14 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> PCR Primer 



<400> 14

gctacacacg tgctacaatg 20



<210> 15 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 15 

cgagttgcag actgcgatcc g 21 

<210> 16
<211> 22
<212> DNA

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 16

aagtcggaat cgctagtaat cg 22

<210> 17
<211> 19
<212> DNA

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 17

gacgggcggt gtgtacaag 19

<210> 18
<211> 25
<212> DNA

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 18

tgaacgctgg tggcatgctt aacac 25

<210> 19
<211> 22
<212> DNA

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 19

tacgcattac tcacccgtcc gc 22

<210> 20 
<211> 26 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 20 

gtggcatgcc taatacatgc aagtcg 26 

<210> 21
<211> 20
<212> DNA

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 21

ttactcaccc gtccgccgct 20

<210> 22
<211> 20
<212> DNA

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 22

taacacatgc aagtcgaacg 20

<210> 23
<211> 17
<212> DNA

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 23

ttactcaccc gtccgcc 17

<210> 24
<211> 18
<212> DNA

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 24

gtgtagcggt gaaatgcg 18



<210> 25 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 25 

gtatctaatc ctgtttgctc cc 22 

<210> 26
<211> 20
<212> DNA

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 26

agaacaccga tggcgaaggc 20

<210> 27
<211> 21
<212> DNA

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 27

cgtggactac cagggtatct a 21

<210> 28
<211> 22
<212> DNA

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 28

ggattagaga ccctggtagt cc 22

<210> 29
<211> 18
<212> DNA

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 29

ggccgtactc cccaggcg 18


<210> 30
<211> 26 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 30 

ggattagata ccctggtagt ccacgc 26 

<210> 31
<211> 18
<212> DNA

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 31

ggccgtactc cccaggcg 18

<210> 32
<211> 22
<212> DNA

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 32

tagataccct ggtagtccac gc 22

<210> 33
<211> 15
<212> DNA

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 33

cgtactcccc aggcg 15

<210> 34
<211> 22
<212> DNA

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 34

ttcgatgcaa cgcgaagaac ct 22



<210> 35 
<211> 20
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 35 

acgagctgac gacagccatg 

<210> 36 
<211> 17 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 36 

acgcgaagaa ccttacc 

<210> 37 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 37 

acgacacgag ctgacgac 

<210> 38 
<211> 18 
<212> DNA 
<213> Artificial Sequence



<220> 

<223> PCR Primer 
<400> 38 

ctgacacctg cccggtgc 

<210> 39 
<211> 19 
<212> DNA 

<213> Artificial Sequence 



<400> 39 

gaccgttata gttacggcc 

<210> 40 
<211> 25 
<212> DNA 
<213> Artificial Sequence
<220> 

<223> PCR Primer 
<400> 40 

tctgtcccta gtacgagagg accgg 

<210> 41 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 41 

tgcttagatg ctttcagc 

<210> 42 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 42

ctgtccctag tacgagagga ccgg 

<210> 43 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 43 

gtttcatgct tagatgcttt cagc

<210> 44 
<211> 26 
<212> DNA 

<213> Artificial Sequence 



<400> 44 

ggggagtgaa agagatcctg aaaccg 

<210> 45 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> PCR Primer 
<400> 45 

acaaaaggta cgccgtcacc c 21 

<210> 46 
<211> 26 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 46 

ggggagtgaa agagatcctg aaaccg 26 

<210> 47 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 47 

acaaaaggca cgccatcacc c 21 

<210> 48 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 48 

cgagagggaa acaacccaga cc 22 

<210> 49 
<211> 19
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 49 

tggctgcttc taagccaac 19 

<210> 50 
<211> 29
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 50

tgctcgtggt gcacaagtaa cggatatta 29



<210> 51 
<211> 29 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 51 

tgctgctttc gcatggttaa ttgcttcaa 29

<210> 52
<211> 28
<212> DNA

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 52

caaaacttat taggtaagcg tgttgact 28

<210> 53
<211> 30
<212> DNA

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 53

tcaagcgcca tttcttttgg taaaccacat 30

<210> 54
<211> 28
<212> DNA

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 54

caaaacttat taggtaagcg tgttgact 28

<210> 55
<211> 30
<212> DNA

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 55

tcaagcgcca tctctttcgg taatccacat 30



<210> 56 
<211> 27 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 56 

taagaagccg gaaaccatca actaccg 27 

<210> 57
<211> 20
<212> DNA

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 57

ggcgcttgta cttaccgcac 20

<210> 58
<211> 19
<212> DNA

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 58

tgattctggt gcccgtggt 19

<210> 59
<211> 22
<212> DNA

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 59

ttggccatca ggccacgcat ac 22

<210> 60
<211> 19
<212> DNA

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 60

tgattccggt gcccgtggt 19


<210> 61
<211> 22 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> PCR Primer 



<400> 61 

ttggccatca gaccacgcat ac 

<210> 62 
<211> 24 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> PCR Primer 



<400> 62 

ctggcaggta tgcgtggtct gatg 

<210> 63 
<211> 25 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> PCR Primer 



<400> 63 

cgcaccgtgg gttgagatga agtac 

<210> 64 
<211> 24 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> PCR Primer 



<400> 64 

cttgctggta tgcgtggtct gatg 

<210> 65 
<211> 25 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> PCR Primer 
<400> 65 

cgcaccatgc gtagagatga agtac 



<210> 66 
<211> 26 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 66 

cgtcgggtga ttaaccgtaa caaccg 

<210> 67 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 67 

gtttttcgtt gcgtacgatg atgtc 

<210> 68 
<211> 26 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 68

cgtcgtgtaa ttaaccgtaa caaccg 

<210> 69 
<211> 27 
<212> DNA 

<213> Artificial Sequence 



<400> 69 

acgtttttcg ttttgaacga taatgct 

<210> 70 
<211> 27 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 70 

caaaggtaag caaggtcgtt tccgtca 

<210> 71 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> PCR Primer
<400> 71

cgaacggcct gagtagtcaa cacg 

<210> 72 
<211> 27 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 72

caaaggtaag caaggacgtt tccgtca 

<210> 73 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 73 

cgaacggcca gagtagtcaa cacg 

<210> 74 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 74 

tagactgccc aggacacgct g 

<210> 75 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 75 

gccgtccatc tgagcagcac c 

<210> 76
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 
<223> PCR Primer
<400> 76

ttgactgccc aggtcacgct g 21



<210> 77 

<211> 21 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 

<400> 77 

gccgtccatt tgagcagcac c 21 

<210> 78
<211> 25
<212> DNA

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 78

aactaccgtc cgcagttcta cttcc 25

<210> 79
<211> 24
<212> DNA

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 79

gttgtcgcca ggcataacca tttc 24

<210> 80
<211> 25
<212> DNA

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 80

aactaccgtc ctcagttcta cttcc 25

<210> 81
<211> 24
<212> DNA

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 81

gttgtcacca ggcattacca tttc 24



<210> 82 
<211> 28 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 82 

ccacagttct acttccgtac tactgacg 28 

<210> 83
<211> 30
<212> DNA

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 83

tccaggcatt accatttcta ctccttctgg 30

<210> 84
<211> 30
<212> DNA

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 84

gacctacagt aagaggttct gtaatgaacc 30

<210> 85
<211> 24
<212> DNA

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 85

tccaagtgct ggtttacccc atgg 24

<210> 86
<211> 23
<212> DNA

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 86

catccacacg gtggtggtga agg 23


<210> 87
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer
<400> 87

gtgctggttt accccatgga gt 

<210> 88 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 88 

cgtgttgact attcggggcg ttcag 

<210> 89 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 89

attcaagagc catttctttt ggtaaaccac 

<210> 90 
<211> 29 
<212> DNA 

<213> Artificial Sequence 



<400> 90 

tcaacaacct cttggaggta aagctcagt 

<210> 91 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 91 

tttcttgaag agtatgagct gctccgtaag 

<210> 92
<211> 23
<212> DNA

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 92

catccacacg gtggtggtga agg 23

<210> 93
<211> 29
<212> DNA

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 93

tgttttgtat ccaagtgctg gtttacccc 29

<210> 94
<211> 20
<212> DNA

<213> Artificial Sequence
<220> 

<223> PCR Primer 
<400> 94 

cgtggcggcg tggttatcga 20 

<210> 95 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 95 

cggtacgaac tggatgtcgc cgtt 24 

<210> 96 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 96

tatcgctcag gcgaactcca ac 22 



<210> 97 
<211> 21 
<212> DNA 
<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 97

gctggattcg cctttgctac g 21



<210> 98 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 98 

tgtaatgaac cctaatgacc atccacacgg 30 

<210> 99
<211> 27
<212> DNA

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 99

ccaagtgctg gtttacccca tggagta 27

<210> 100
<211> 30
<212> DNA

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 100

taatgaaccc taatgaccat ccacacggtg 30

<210> 101
<211> 26
<212> DNA

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 101

tccaagtgct ggtttacccc atggag 26

<210> 102
<211> 29
<212> DNA

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 102

cttggaggta agtctcattt tggtgggca 29

<210> 103
<211> 30
<212> DNA

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 103

cgtataagct gcaccataag cttgtaatgc 30

<210> 104
<211> 18
<212> DNA

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 104 

cgacgcgctg cgcttcac 18 

<210> 105 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 105 

gcgttccaca gcttgttgca gaag 24 

<210> 106 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 106 

gaccacctcg gcaaccgt 18 
<210> 107 

<211> 18 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 107

ttcgctctcg gcctggcc 

<210> 108 
<211> 27 
<212> DNA 

<213> Artificial Sequence 



<400> 108 

gcactatgca cacgtagatt gtcctgg

<210> 109 
<211> 26 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 109 

tatagcacca tccatctgag cggcac 

<210> 110 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 110 

cggcgtactt caacgacagc ca 

<210> 111 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 111 

cgcggtcggc tcgttgatga 

<210> 112 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 



<400> 112 
cttctgcaac aagctgtgga acgc

<210> 113 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 113

tcgcagttca tcagcacgaa gcg 

<210> 114 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 114 

aagacgacct gcacgggc 

<210> 115 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 115 

gcgctccacg tcttcacgc 

<210> 116 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 116 

ctgttcttag tacgagagga cc 

<210> 117 
<211> 21 
<212> DNA 

<213> Artificial Sequence 

<220>
<223> PCR Primer 

<400> 117 

ttcgtgctta gatgctttca g 

<210> 118

<211> 17

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 

<400> 118 

acgcgaagaa ccttacc 

<210> 119 

<211> 18 

<212> DNA

<213> Artificial Sequence 

<220> 

<223> PCR Primer 

<400> 119 

acgacacgag ctgacgac 

<210> 120 

<211> 14 

<212> DNA 

<213> Artificial Sequence 
<220>

<223> PCR Primer 

<400> 120 
cgaagaacct tacc 

<210> 121 

<211> 12 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 

<400> 121 
acacgagctg ac 

<210> 122 

<211> 14 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 

<400> 122 
cgaagaacct tacc 



<210> 123 
<211> 12 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 123 

acacgagctg ac 12 

<210> 124 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 124 

cctgataagg gtgaggtcg 19

<210> 125 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 125 

acgtccttca tcgcctctga 20 

<210> 126 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 126 

gttgtgaggt taagcgacta ag 22 

<210> 127 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 127 

ctatcggtca gtcaggagta t 21

<210> 128 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> PCR Primer 
<400> 128 

gttgtgaggt taagcgacta ag 

<210> 129 
<211> 20 
<212> DNA 

<213> Artificial Sequence 



<400> 129 

ttgcatcggg ttggtaagtc 

<210> 130 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 130 

atactcctga ctgaccgata g 

<210> 131 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 131 

aacatagcct tctccgtcc 

<210> 132 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 132 

gacttaccaa cccgatgcaa 

<210> 133
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 
<223> PCR Primer
<400> 133 

taccttagga ccgttatagt tacg 

<210> 134 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 134 

ggacggagaa ggctatgtt 

<210> 135 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 135 

ccaaacaccg ccgtcgatat 

<210> 136 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer
<400> 136 

cgtaactata acggtcctaa ggta 

<210> 137 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 137 

gcttacacac ccggcctatc 

<210> 138 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 138

atatcgacgg cggtgtttgg 

<210> 139 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 139 

gcgtgacagg caggtattc 

<210> 140 
<211> 22 
<212> DNA 

<213> Artificial Sequence 



<400> 140 

agtctcaaga gtgaacacgt aa 

<210> 141 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 141 

gctgctggca cggagtta 

<210> 142 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 142 

gacacggtcc agactcctac 

<210> 143 
<211> 18 
<212> DNA 

<213> Artificial Sequence 

<220>
<223> PCR Primer 

<400> 143 

ccatgcagca cctgtctc 

<210> 144
<211> 20 
<212> DNA 

<213> Artificial Sequence 



<400> 144 

gatctggagg aataccggtg 

<210> 145 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 145 

acggttacct tgttacgact 

<210> 146 
<211> 20 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 146 

gagagcaagc ggacctcata 

<210> 147 
<211> 17 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 147 

cctcctgcgt gcaaagc 

<210> 148 
<211> 19 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> PCR Primer
<400> 148

tggaagatct gggtcaggc 

<210> 149
<211> 22
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 149 

caatctgctg acggatctga gc 

<210> 150 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 150 

gtcgtgaaaa cgagctggaa ga 

<210> 151 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 151 

catgatggtc acaaccgg 

<210> 152 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 152 

tggcgaacct ggtgaacgaa gc 

<210> 153 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 153 

ctttcgcttt ctcgaactca accat 

<210> 154 
<211> 26 
<212> DNA 
<213> Artificial Sequence
<220> 

<223> PCR Primer 
<400> 154 

cgtcagggta aattccgtga agttaa 

<210> 155 
<211> 21 
<212> DNA 

<213> Artificial Sequence 



<400> 155 

aacttcgcct tcggtcatgt t 

<210> 156 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 156 

ggtgaaagaa gttgcctcta aagc

<210> 157 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 157 

ttcaggtcca tcgggttcat gcc 

<210> 158 
<211> 20 
<212> DNA 

<213> Artificial Sequence 



<400> 158 

cgtggcggcg tggttatcga 

<210> 159 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> PCR Primer 
<400> 159 

acgaactgga tgtcgccgtt 

<210> 160 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 160 

cggaattact gggcgtaaag 

<210> 161 
<211> 18
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 161 

cgcatttcac cgctacac 

<210> 162 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 162 

acccagtgct gctgaaccgt gc 

<210> 163 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 163 

gttcaaatgc ctggataccc a 

<210> 164 
<211> 22
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 






<400> 164 

gggagcaaac aggattagat ac 

<210> 165 
<211> 15 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 

<400> 165 
cgtactcccc aggcg 

<210> 166 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 166 

tggcccgaaa gaagctgagc g 

<210> 167 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 167

acgcgggcat gcagagatgc c 

<210> 168 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 168 

atgttgggtt aagtcccgc 

<210> 169 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 169

tgacgtcatc cccaccttcc

<210> 170 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 170 

cttgtacaca ccgcccgtc 

<210> 171 
<211> 17 
<212> DNA

<213> Artificial Sequence
<220> 

<223> PCR Primer 
<400> 171 

aaggaggtga tccagcc 

<210> 172 
<211> 21 
<212> DNA 

<213> Artificial Sequence 



<400> 172 

cggattggag tctgcaactc g 

<210> 173 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 173 

gacgggcggt gtgtacaag 

<210> 174 
<211> 15 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 

<400> 174 
ggtggatgcc ttggc 

<210> 175
<211> 16 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 

<400> 175 
gggtttcccc attcgg 

<210> 176 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 176 

gggaactgaa acatctaagt a 

<210> 177 
<211> 15 
<212> DNA 

<213> Artificial Sequence 



<400> 177 
ttcgctcgcc gctac 

<210> 178 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 178 

taccccaaac cgacacagg 

<210> 179 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 179 

ccttctcccg aagttacg 

<210> 180 
<211> 19 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 180 

ccgtaacttc gggagaagg 

<210> 181 
<211> 15 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 

<400> 181 
caccgggcag gcgtc 

<210> 182 
<211> 16 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 

<400> 182 
gacgcctgcc cggtgc 

<210> 183 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 183 

ccgacaagga atttcgctac c 

<210> 184 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 184 

aaggtactcc ggggataaca ggc

<210> 185 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<400> 185

agccgacatc gaggtgccaa ac 

<210> 186 
<211> 18 
<212> DNA 

<213> Artificial Sequence 



<400> 186 

gacagttcgg tccctatc

<210> 187 
<211> 17 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 187 

ccggtcctct cgtacta 

<210> 188 
<211> 17 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 188 

tagtacgaga ggaccgg 

<210> 189 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 189 

ttagatgctt tcagcactta tc 

<210> 190
<211> 24 
<212> DNA 

<213> Artificial Sequence 



<220> 
<223> PCR Primer
<400> 190 

aaactagata acagtagaca tcac 

<210> 191 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 191 

gtgcgccctt tctaactt 

<210> 192 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 192 

agagtttgat catggctcag 

<210> 193 
<211> 17 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 193

actgctgcct cccgtag 

<210> 194 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 194

cactggaact gagacacgg 

<210> 195 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 195

ctttacgccc agtaattccg 

<210> 196 
<211> 19 
<212> DNA 

<213> Artificial Sequence 



<400> 196 

ccagcagccg cggtaatac 

<210> 197 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 197 

gtatctaatc ctgtttgctc cc 

<210> 198 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 198

gtgtagcggt gaaatgcg 

<210> 199 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 199 

ggtaaggttc ttcgcgttg 

<210> 200 
<211> 18 
<212> DNA 

<213> Artificial Sequence 



<220>

<223> PCR Primer 



<400> 200 

aagcggtgga gcatgtgg 

<210> 201
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 201

attgtagcac gtgtgtagcc c 

<210> 202 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 202 

caagtcatca tggccctta 

<210> 203 
<211> 17 
<212> DNA 

<213> Artificial Sequence 



<400> 203 

aaggaggtga tccagcc 

<210> 204 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 204 

agagtttgat catggctcag 

<210> 205 
<211> 17 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 205 

aaggaggtga tccagcc 



<210> 206 
<211> 19
<212> DNA 

<213> Artificial Sequence 



<400> 206

acctgcccag tgctggaag 

<210> 207 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 207

tcgctacctt aggaccgt 

<210> 208 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 208 

gccttgtaca cacctcccgt c 

<210> 209 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 209 

cacggctacc ttgttacgac 

<210> 210 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 210

ttgtacacac cgcccgtcat ac 



<210> 211 
<211> 20 
<212> DNA 
<213> Artificial Sequence
<220> 

<223> PCR Primer 
<400> 211 

ccttgttacg acttcacccc 

<210> 212 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 212 

tacggtgaat acgttcccgg g 

<210> 213
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 213 

accttgttac gacttcaccc ca 

<210> 214 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 214 

accacgccgt aaacgatga 

<210> 215 
<211> 21 
<212> DNA 

<213> Artificial Sequence 



<400> 215 

cccccgtcaa ttcctttgag t 

<210> 216
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> PCR Primer 
<400> 216 

gataccctgg tagtccacac cg

<210> 217 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 217 

gccttgcgac cgtactccc 

<210> 218 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 218 

tagataccct ggtagtccac gc 

<210> 219 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 219 

gcgaccgtac tccccagg 

<210> 220 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 220 

tagtcccgca acgagcgc

<210> 221 
<211> 22
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer
<400> 221

gacgtcatcc ccaccttcct cc 

<210> 222 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 222 

tagaacgtcg cgagacagtt cg

<210> 223 
<211> 20 
<212> DNA 

<213> Artificial Sequence 



<400> 223 

agtccatccc ggtcctctcg 

<210> 224 
<211> 17 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 224 

gaggaaagtc cgggctc

<210> 225 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 225 

ataagccggg ttctgtcg

<210> 226 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 226

gaggaaagtc catgctcgc

<210> 227 
<211> 22 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 227 

gtaagccatg ttttgttcca tc 

<210> 228 
<211> 17 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 228 

gaggaaagtc cgggctc 

<210> 229 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 229 

ataagccggg ttctgtcg 

<210> 230 
<211> 35 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 230 

gcgggatcct ctagaggtgt taaatagcct ggcag 

<210> 231 
<211> 35 
<212> DNA 

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 231 

gcgggatcct ctagaagacc tcctgcgtgc aaagc 
<210> 232
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 232 

gaggaaagtc catgctcac 

<210> 233 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 233 

ataagccatg ttctgttcca tc 

<210> 234 
<211> 19 
<212> DNA 

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 234 

atgttgggtt aagtcccgc 
<210> 235 

<211> 17

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 235 

aaggaggtga tccagcc 

<210> 236 
<211> 20 
<212> DNA 

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 236 

cggaattact gggcgtaaag 



<210> 237 
<211> 22 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 237 

gtatctaatc ctgtttgctc cc 22 

<210> 238 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 238 

atgttgggtt aagtcccgc 19 

<210> 239 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 239 

tgacgtcatg cccaccttcc 20 

<210> 240 

<211> 19 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 

<400> 240 

atgttgggtt aagtcccgc 19 

<210> 241 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 241 

tgacgtcatg gccaccttcc 20

<210> 242 
<211> 34 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> PCR Primer
<400> 242

gcgggatcct ctagacctga taagggtgag gtcg 

<210> 243 

<211> 34 

<212> DNA 

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 243 

gcgggatcct ctagagcgtg acaggcaggt attc 

<210> 244 
<211> 17 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 244 

acgcgaagaa ccttacc 

<210> 245 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 245 

gacgggcggt gtgtacaag 

<210> 246 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer
<400> 246 

gtgtagcggt gaaatgcg 

<210> 247 
<211> 21 
<212> DNA 

<213> Artificial Sequence 



<220> 
<223> PCR Primer
<400> 247 

cgagttgcag actgcgatcc g 

<210> 248 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 248 

taacacatgc aagtcgaacg 

<210> 249 
<211> 15 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 

<400> 249 
cgtactcccc aggcg 

<210> 250 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 250 

taacacatgc aagtcgaacg 

<210> 251 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 251 

acgacacgag ctgacgac 

<210> 252 
<211> 22 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 252

acaacgaagt acaatacaag ac 

<210> 253 
<211> 22 
<212> DNA 

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 253 

cttctacatt tttagccatc ac 

<210> 254 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 254 

ttaagtcccg caacgagcgc aa 

<210> 255 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 255 

tgacgtcatc cccaccttcc tc 

<210> 256 
<211> 28 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 256 

tgagtgatga aggccttagg gttgtaaa 

<210> 257 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 257 

cggctgctgg cacgaagtta g 
<210> 258
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 258 

atggacaagg ttggcaagga agg 

<210> 259 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 259

tagccgcggt cgaattgcat 

<210> 260 
<211> 26 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 260 

aaggaaggcg tgatcaccgt tgaaga 

<210> 261 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 261 

ccgcggtcga attgcatgcc ttc 

<210> 262 
<211> 16 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 

<400> 262 
acgcgctgcg cttcac 

<210> 263 
<211> 20
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 263

ttgcagaagt tgcggtagcc 

<210> 264 
<211> 18 
<212> DNA 

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 264 

tcgaccacct gggcaacc 

<210> 265 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 265 

atcaggtcgt gcggcatca 

<210> 266 
<211> 17 
<212> DNA 

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 266 

cacggtgccg gcgtact 

<210> 267 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 267 

gcggtcggct cgttgatgat 

<210> 268 
<211> 25 
<212> DNA 
<213> Artificial Sequence
<220> 

<223> PCR Primer 
<400> 268 

ttggaggtaa gtctcatttt ggtgg 

<210> 269 
<211> 25
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 269 

aagctgcacc ataagcttgt aatgc 

<210> 270 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 270 

cagcgtttcg gcgaaatgga 

<210> 271 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 271 

cgacttgacg gttaacattt cctg 

<210> 272 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 272 

gggcagcgtt tcggcgaaat gga 

<210> 273 
<211> 27 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> PCR Primer 
<400> 273 

gtccgacttg acggtcaaca tttcctg 

<210> 274 
<211> 29 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 274 

caggagtcgt tcaactcgat ctacatgat 

<210> 275 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 275 

acgccatcag gccacgcat 

<210> 276 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 276 

gcacaacctg cggctgcg 

<210> 277 
<211> 18 
<212> DNA 

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 277 

acggcacgag gtagtcgc 

<210> 278 
<211> 20 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> PCR Primer 
tgttactcac ccgtctgcca ct

<210> 284 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 284 

accgagcaag gagaccagc

<210> 285 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 285 

tataacgcac atcgtcaggg tga 

<210> 286 
<211> 20 
<212> DNA 

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 286 

agacccaatt acattggctt 

<210> 287 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 287 

ccagtgctgt tgtagtacat 

<210> 288 
<211> 20 
<212> DNA 

<213> Artificial Sequence 

<220>
<223> PCR Primer 

<400> 288 

atgtactaca acagtactgg 
<210> 289
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 289 

caagtcaacc acagcattca 

<210> 290
<211> 20 
<212> DNA 

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 290

gggcttatgt actacaacag 

<210> 291 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 291 

tctgtcttgc aagtcaacca c 

<210> 292 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 292 

ggaatttttt gatggtagag a 

<210> 293 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 293 

taaagcacaa tttcaggcg 



<210> 294 
<211> 20 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 294

tagatctggc tttctttgac 

<210> 295 
<211> 21 
<212> DNA 
<213> Artificial Sequence



<220> 

<223> PCR Primer 
<400> 295 

atatgagtat ctggagtctg c 

<210> 296 
<211> 21 
<212> DNA 

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 296 

ggaaagacat tactgcagac a 

<210> 297 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 297 

ccaacttgag gctctggctg 

<210> 298 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 298 

acagacactt accagggtg 

<210> 299 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> PCR Primer 
<400> 299 

actgtggtgt catctttgtc 

<210> 300 
<211> 22 
<212> DNA 

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 300 

tcactaaaga caaaggtctt cc 

<210> 301 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 301 

ggcttcgccg tctgtaattt c 

<210> 302 
<211> 21 
<212> DNA 

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 302 

cggatccaag ctaatctttg g 

<210> 303 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 303 

ggtatgtact cataggtgtt ggtg 

<210> 304 
<211> 20 
<212> DNA 

<213> Artificial Sequence 



<220> 
<223> PCR Primer
<400> 304 

agacccaatt acattggctt 

<210> 305 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 305 

ccagtgctgt tgtagtacat 

<210> 306 

<211> 20 

<212> DNA 

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 306 

atgtactaca acagtactgg 

<210> 307 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 307 

caagtcaacc acagcattca 

<210> 308 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 308 

gggcttatgt actacaacag 

<210> 309 
<211> 21 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 309

tctgtcttgc aagtcaacca c 

<210> 310 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 310 

ggaatttttt gatggtagag a 

<210> 311 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 311

taaagcacaa tttcaggcg 

<210> 312 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 312

tagatctggc tttctttgac 

<210> 313 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 313

atatgagtat ctggagtctg c 

<210> 314 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 



<400> 314 

cggatccaag ctaatctttg g 
<210> 315
<211> 24 
<212> DNA 

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 315

ggtatgtact cataggtgtt ggtg 

<210> 316 
<211> 23 
<212> DNA 

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 316 

aacagaccca attacattgg ctt 

<210> 317 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 317 

gaggcacttg tatgtggaaa gg 

<210> 318 
<211> 23 
<212> DNA 

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 318 

atgcctaaca gacccaatta cat 

<210> 319 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer
<400> 319 

ttcatgtagt cgtaggtgtt gg 



<210> 320 



WO 2004/060278



PCT/US2003/038761 



<220> 

<223> PCR Primer 
<400> 330 

ggtcgttatg tgcctttcca cat 23 

<210> 331 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 331 

tcctttctga agttccactc atagg 25 

<210> 332 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 332

acaacattgg ctaccagggc tt 22 

<210> 333 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 333 

cctgcctgct cataggctgg aagtt 25 

<210> 334 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer
<400> 334 

ggattagaga ccctggtagt cc 22 
<210> 335 

<211> 18
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 335

ggccgtactc cccaggcg 

<210> 336 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 336 

ttcgatgcaa cgcgaagaac ct 

<210> 337 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 337 

acgagctgac gacagccatg 

<210> 338 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 338 

tctgtcccta gtacgagagg accgg 

<210> 339 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 339 

tgcttagatg ctttcagc 

<210> 340 
<211> 24 
<212> DNA 

<213> Artificial Sequence
<220> 

<223> PCR Primer 



<400> 340 
ctggcaggta tgcgtggtct gatg

<210> 341 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 341 

cgcaccgtgg gttgagatga agtac 

<210> 342 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 342 

ggggattcag ccatcaaagc agctattgac 

<210> 343 
<211> 26 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 343 

ccaacctttt ccacaacaga atcagc 

<210> 344 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 344 

ccttacttcg aactatgaat cttttggaag 

<210> 345 
<211> 29 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 



<400> 345 

cccatttttt cacgcatgct gaaaatatc 
<210> 346

<211> 23 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 

<400> 346 

cgcaaaaaaa tccagctatt agc

<210> 347 
<211> 29 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 347 

aaactatttt tttagctata ctcgaacac 

<210> 348 
<211> 29 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 348 

atgattacaa ttcaagaagg tcgtcacgc 

<210> 349 
<211> 27 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 349 

ttggacctgt aatcagctga atactgg 

<210> 350 
<211> 29 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 350 

gatgactttt tagctaatgg tcaggcagc 



<210> 351 
<211> 29 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 351 

aatcgacgac catcttggaa agatttctc 

<210> 352 
<211> 25 
<212> DNA 






<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 352 

gcttcaggaa tcaatgatgg agcag 25 

<210> 353 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 353 

gggtctacac ctgcacttgc ataac 25 

<210> 354 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 354

gtactgaatc cgcctaag 18 

<210> 355
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 355 

gtgaataaag tatcgcccta ata 23

<210> 356 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> PCR Primer 
<400> 356 

gaagttgaac cgggatca 

<210> 357 
<211> 21 
<212> DNA 

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 357 

attatcggtc gttgttaatg t 

<210> 358 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 358 

ctgtctgtag ataaactagg att 

<210> 359 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 359 

cgttcttctc tggaggat 

<210> 360 
<211> 15 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 

<400> 360 
cgatactacg gacgc 

<210> 361 
<211> 22
<212> DNA 

<213> Artificial Sequence 
<220> 
<210> 372
<211> 15 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 

<400> 372 
cgatactacg gacgc 

<210> 373 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 373 

ctttatgaat tactttacat at 

<210> 374 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer 
<400> 374

ctcctccatc actaggaa 

<210> 375 
<211> 22 
<212> DNA 

<213> Artificial Sequence
<220>

<223> PCR Primer
<400> 375

ctataacatt caaagcttat tg 

<210> 376 
<211> 23 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> PCR Primer 



<400> 376 

cgcgataata gatagtgcta aac 
<210> 377
<211> 19
<212> DNA

<213> Artificial Sequence 
<220> 

<223> PCR Primer
<400> 377 

gcttccacca ggtcattaa 